SEMANTIC BASED DATA STORAGE WITH NEXT GENERATION CATEGORIZER

IJCSEC Front Page

Abstract
Namespace management in conventional file systems is based on hierarchical directory trees. This tree-based namespace scheme is prone to severe performance bottlenecks and often fails to provide real-time responses to complex data lookups. This paper proposes a semantic-aware namespace scheme, called SANE, which provides dynamic and adaptive namespace management for ultra-large storage systems with billions of files. Associative access to files is provided by an initial extension to existing tree-structured file system protocols, together with protocols designed specifically for content-based file system access. Requests for file details, such as versions or other attributes, are interpreted as queries against our container engine, which thus provides flexible associative access to files. A key feature of the system is that it indexes the key properties of file system objects and caches these indexes within the file system. The automatic indexing of files and their grouping by relatedness is called “semantic” because the user-programmable nature of the system uses information about the semantics of updated file system objects to extract the properties for indexing. The semantic correlations and file groups identified in SANE can also be used to facilitate file prefetching and data de-duplication, among other system-level optimizations.
Index Terms: File systems, storage systems, semantic awareness, namespace management
I. Introduction
The main aim of this project is to store data in a custom repository (a container) within the file system. Data is compressed before it is stored in the container, and an index of the data inside the container makes retrieval faster. The container also restricts data access, providing security and access rights on the documents it holds, and stores data in groups based on keywords and metadata about each document. The scope of this project covers indexing the data in the container, version management of that data, and easy access to the data in the container.
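The container described above can be illustrated with a minimal in-memory sketch. It is not the project's implementation: the class name `Container`, its methods, and the choice of `zlib` for compression are assumptions made for illustration only. The sketch shows the four stated ideas together: compressed storage, a keyword index for fast lookup, per-document versions, and a simple access-rights check.

```python
import zlib
from collections import defaultdict


class Container:
    """Toy container (hypothetical): compressed, versioned, keyword-indexed documents."""

    def __init__(self):
        self._versions = defaultdict(list)  # name -> list of compressed blobs
        self._index = defaultdict(set)      # keyword -> set of document names
        self._readers = defaultdict(set)    # name -> users allowed to read

    def put(self, name, data, keywords, readers):
        """Store a new version of `name` and index it under its keywords."""
        self._versions[name].append(zlib.compress(data))
        for kw in keywords:
            self._index[kw.lower()].add(name)
        self._readers[name].update(readers)
        return len(self._versions[name])    # 1-based version number

    def search(self, keyword):
        """Return the names of documents tagged with `keyword`, using the index."""
        return sorted(self._index.get(keyword.lower(), set()))

    def get(self, name, user, version=None):
        """Return the decompressed bytes of a version (the latest by default)."""
        if name not in self._versions:
            raise KeyError(name)
        if user not in self._readers[name]:
            raise PermissionError(f"{user} may not read {name}")
        blobs = self._versions[name]
        blob = blobs[-1] if version is None else blobs[version - 1]
        return zlib.decompress(blob)


box = Container()
box.put("report.txt", b"draft one", ["finance", "Q1"], ["alice"])
box.put("report.txt", b"draft two", ["finance"], ["alice"])
print(box.search("finance"))                       # ['report.txt']
print(box.get("report.txt", "alice"))              # b'draft two'
print(box.get("report.txt", "alice", version=1))   # b'draft one'
```

A lookup by keyword touches only the index entry for that keyword rather than every stored document, which is the reason the index makes retrieval faster. A persistent container would also need on-disk storage and index updates on deletion, which this sketch omits.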
SYNOPSIS
Fast and flexible metadata retrieval is a critical requirement in next-generation data storage systems serving high-end computing. As storage capacity approaches the exabyte scale and the number of stored files reaches billions, the directory-tree based metadata management widely deployed in conventional file systems can no longer meet the requirements of scalability and functionality. Although existing distributed database systems work well in some real-world data-intensive applications, they are inefficient in very large-scale file systems for four main reasons. First, as storage systems scale up rapidly, a very large-scale file system, the main concern of this paper, generally consists of thousands of server nodes, contains trillions of files, and reaches exabyte-scale (EB) data volumes. Unfortunately, existing distributed databases fail to manage petabytes of data and thousands of concurrent requests efficiently. In next-generation file systems, metadata accesses will very likely become a severe performance bottleneck, as metadata-based transactions not only account for over 50 percent of all file system operations but also result in billions of pieces of metadata in directories.
While a high-end or next-generation storage system can provide petabyte-scale or even exabyte-scale storage capacity containing an ocean of data, what users really want for their applications is some knowledge about the data’s behavioral and structural properties. In real-world applications, cache-based structures have proven to be very useful for indexing massive amounts of data. However, traditional temporal or spatial (or both) locality-aware methods alone will not be effective for constructing and maintaining caches in large-scale systems that must hold the working data sets of complex data-intensive applications. Semantic correlation comes from the exploitation of high-dimensional attributes of metadata.
The main benefit of using semantic correlation is the ability to significantly narrow the search space and improve system performance. Here, we propose a novel decentralized semantic-aware metadata organization, called Smart Store, that exploits semantic correlation to enable efficient complex queries for users and to improve system performance in real-world applications.
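How semantic correlation narrows the search space can be illustrated with a small sketch. It does not reproduce the Smart Store organization: the greedy centroid grouping, the function names, the two-attribute metadata vectors, and the threshold value are all assumptions chosen for illustration. Files with similar metadata attributes are placed in the same group, and a query is answered by scanning only the group whose centroid lies closest to it.

```python
import math


def distance(a, b):
    """Euclidean distance between two metadata attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def build_groups(files, threshold):
    """Greedy grouping: join the first group whose centroid is within
    `threshold`, otherwise start a new group."""
    groups = []  # each group is a list of (name, vector) pairs
    for name, vec in files.items():
        for group in groups:
            if distance(vec, centroid([v for _, v in group])) <= threshold:
                group.append((name, vec))
                break
        else:
            groups.append([(name, vec)])
    return groups


def nearest_file(groups, query):
    """Scan only the group closest to `query`; return (file name, files scanned)."""
    best = min(groups, key=lambda g: distance(query, centroid([v for _, v in g])))
    name, _ = min(best, key=lambda item: distance(query, item[1]))
    return name, len(best)


# Hypothetical metadata: (normalized size, normalized age) per file.
files = {
    "a.log": (1.0, 1.0),
    "b.log": (1.2, 0.9),
    "c.mp4": (9.0, 8.0),
    "d.mp4": (8.5, 8.2),
}
groups = build_groups(files, threshold=2.0)
print(len(groups))                        # 2
print(nearest_file(groups, (8.8, 8.1)))   # ('c.mp4', 2)
```

In this example the query inspects two files instead of four. Restricting the scan to one group makes the result approximate, since the true nearest file could sit in a neighboring group.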
