SEMANTIC BASED DATA STORAGE WITH NEXT GENERATION CATEGORIZER

Abstract
Namespace management in conventional file systems is based on hierarchical directory trees. This tree-based
namespace scheme is prone to severe performance bottlenecks and often fails to provide real-time response to
complex data lookups. This paper proposes a semantic-aware namespace scheme, called SANE, which provides
dynamic and adaptive namespace management for ultra-large storage systems with billions of files. Associative
access to files is provided by an initial extension to existing tree-structured file system protocols, and by
protocols designed specifically for content-based file system access. Access to file details such as versions or
other concepts is interpreted as a query applied to our container engine, which thus provides flexible
associative access to files. A key feature of the system is the indexing of key properties of file system objects,
together with indexing and caching on the file system. The automatic indexing of files and their grouping by
relatedness is called "semantic" because the user-programmable nature of the system uses information about the
semantics of updated file system objects to extract the properties for indexing. The semantic correlations and
file groups identified in SANE can also be used to facilitate file prefetching and data de-duplication, among
other system-level optimizations.
Index Terms: File systems, storage systems, semantic awareness, namespace management
I. Introduction
The main aim of this project is to store data in a custom repository within the file system. The data is
compressed and stored in a container, and indexing the data inside the container makes retrieval faster.
Access to the container is restricted, with security and access rights enforced on the documents it holds, and
the data is stored in groups based on keywords and metadata information about each document. The scope of this
project covers indexing the data in the container, version management of that data, and easy access to the data
in the container.
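The container behaviour described in the introduction can be illustrated with a minimal sketch. It assumes an in-memory store; the `Container` class, its method names, and the sample documents are hypothetical and are not taken from the project itself. Access rights are left out to keep the example short. Each write compresses the payload, appends it as a new version, and indexes the document under its keywords.

```python
import zlib
from collections import defaultdict


class Container:
    """Hypothetical in-memory stand-in for the container (illustration only)."""

    def __init__(self):
        self._versions = defaultdict(list)  # document name -> compressed payloads, oldest first
        self._index = defaultdict(set)      # keyword -> names of documents tagged with it

    def put(self, name, data, keywords):
        """Compress data, store it as a new version, and index it; return the 1-based version."""
        self._versions[name].append(zlib.compress(data))
        for keyword in keywords:
            self._index[keyword.lower()].add(name)
        return len(self._versions[name])

    def get(self, name, version=None):
        """Return the decompressed bytes of one version (the latest by default)."""
        blobs = self._versions.get(name)
        if not blobs:
            raise KeyError(name)
        position = len(blobs) - 1 if version is None else version - 1
        if not 0 <= position < len(blobs):
            raise KeyError(f"{name} has no version {version}")
        return zlib.decompress(blobs[position])

    def search(self, *keywords):
        """Return the sorted names of documents indexed under every given keyword."""
        if not keywords:
            return []
        matches = [self._index.get(keyword.lower(), set()) for keyword in keywords]
        return sorted(set.intersection(*matches))


box = Container()
box.put("report.txt", b"draft", ["finance", "2023"])
box.put("report.txt", b"final", ["finance", "audit"])
box.put("notes.txt", b"meeting notes", ["audit"])

print(box.search("audit"))             # documents grouped under one keyword
print(box.search("finance", "audit"))  # documents matching both keywords
print(box.get("report.txt"))           # latest version
print(box.get("report.txt", 1))        # first version
```

A keyword lookup touches only the index entry for that keyword, so retrieval does not require scanning or decompressing every stored document.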
SYNOPSIS
Fast and flexible metadata retrieval is a critical requirement in next-generation data storage systems serving
high-end computing. As storage capacity approaches the exabyte scale and the number of stored files reaches
billions, the directory-tree-based metadata management widely deployed in conventional file systems can no longer
meet the requirements of scalability and functionality. Although existing distributed database systems work
well in some real-world data-intensive applications, they are inefficient in very large-scale file systems for
four main reasons. First, as storage systems scale up rapidly, a very large-scale file system, the main
concern of this paper, generally consists of thousands of server nodes, contains trillions of files, and reaches
exabyte (EB) data volumes. Unfortunately, existing distributed databases fail to manage petabytes of data and
thousands of concurrent requests efficiently. In next-generation file systems, metadata accesses will very
likely become a severe performance bottleneck, as metadata-based transactions not only account for over
50 percent of all file system operations but also result in billions of pieces of metadata in directories. While
a high-end or next-generation storage system can provide petabyte-scale or even exabyte-scale storage
capacity containing an ocean of data, what users really want for their applications is some knowledge about the
data's behavioral and structural properties. In real-world applications, cache-based structures have proven
very useful for indexing massive amounts of data. However, traditional temporal or spatial (or both)
locality-aware methods alone will not be effective for constructing and maintaining caches in large-scale
systems that contain the working data sets of complex data-intensive applications.
Semantic correlation comes from exploiting the high-dimensional attributes of metadata. The main benefit of
using semantic correlation is the ability to significantly narrow the search space and improve system
performance. Here, we propose a novel decentralized semantic-aware metadata organization, called Smart Store,
to effectively exploit semantic correlation, enabling efficient complex queries for users and improving system
performance in real-world applications.
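How semantic correlation narrows the search space can be shown with a small sketch. It is not the Smart Store design itself: it assumes each file is described by a numeric attribute vector (here a hypothetical pair of size in MB and age in days) and swaps in a simple leader-based grouping with a distance threshold. A range query then skips every group that cannot contain a match. All names and values below are illustrative assumptions.

```python
import math


def build_groups(files, threshold):
    """Leader grouping: a file joins the first group whose leader is within threshold."""
    groups = []
    for name, vector in files.items():
        for group in groups:
            if math.dist(group["leader"], vector) <= threshold:
                group["members"].append((name, vector))
                break
        else:
            # No existing group is close enough, so this file starts a new one.
            groups.append({"leader": tuple(vector), "members": [(name, vector)]})
    for group in groups:
        # Largest leader-to-member distance, used to prune groups at query time.
        group["radius"] = max(math.dist(group["leader"], v) for _, v in group["members"])
    return groups


def range_query(groups, point, radius):
    """Return (sorted names within radius of point, number of groups actually scanned)."""
    hits, scanned = [], 0
    for group in groups:
        # Triangle inequality: if the leader is this far away, no member can match.
        if math.dist(group["leader"], point) > radius + group["radius"]:
            continue
        scanned += 1
        hits.extend(n for n, v in group["members"] if math.dist(v, point) <= radius)
    return sorted(hits), scanned


# Hypothetical metadata: (size in MB, age in days).
files = {
    "a.log": (1.0, 2.0),
    "b.log": (1.2, 2.1),
    "c.log": (0.9, 1.8),
    "video1.mp4": (700.0, 300.0),
    "video2.mp4": (705.0, 310.0),
}

groups = build_groups(files, threshold=20.0)
hits, scanned = range_query(groups, point=(1.0, 2.0), radius=0.5)
print(hits, f"scanned {scanned} of {len(groups)} groups")
```

Because each group records its radius, the pruning test is exact: a skipped group can never hold a matching file, so the query inspects only the groups whose attributes correlate with the request.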