Modern data is too big and complex to analyze in its native form. Data simplification is the process whereby large and complex data is rendered usable. Complex data must be simplified before it can be analyzed, but the process of data simplification is anything but simple, requiring a specialized set of skills and tools.

Data Simplification: Taming Information with Open Source Tools provides data scientists from every scientific discipline with the methods and tools to simplify their data for immediate analysis or for long-term storage, in a form that can be readily repurposed or integrated with other data.

Drawing upon years of practical experience and using numerous examples and use cases, Jules Berman discusses:
- Principles, methods, and tools that must be studied and mastered to achieve data simplification.
- Open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data.
- Natural language processing and machine translation as tools to simplify data.
- Data summarization and visualization, and the role they play in making data useful for the end user.

The book:
- Discusses data simplification principles, methods, and tools that must be studied and mastered
- Provides open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data
- Explains how to best utilize indexes to search, retrieve, and analyze textual data
- Shows the data scientist how to apply ontologies, classifications, classes, properties, and instances to data using ...
- Presents practical examples

Contents:
- Structuring Text
- Indexing and Annotating Data
- Data Summarization and Visualization
- Identifying Data
- Giving Data Meaning and Expressing Data Relationships
- Smart Data: Introspection, Reflection, and Integration
- Data Reduction and Transformation

- ISBN: 978-0-12-803781-2
- Publisher: Morgan Kaufmann
- Binding: Paperback
- Pages: 404
- Publication date: 08/04/2016
- No. of volumes: 1
- Language: English