THE DCIPHER ANALYTICS STUDIO
A SaaS solution that perfects human-computer collaboration for augmented text analytics
Harnessing the power of natural language processing, deep learning, and visual analytics to extract value from text
Comprehensive data import options
Import data from a range of file formats, or from one of the most extensive social/news media archives on the internet. Segment, extract and score text units by relevance.
Interactive exploration
Let topics and associations emerge bottom-up and visualize relationships within the data. Iterate with the help of immediate visual feedback. Use document landscapes and token networks to find meaning in large document collections.
Parallelized computation
Speed up processing with distributed, fully parallelized computing in the cloud. Scale up to the number of cores you need to get the job done as fast as you want.
Smart data prep
Merge flat or nested datasets. Standardize inconsistent date formats and use fuzzy matching to merge similar texts. Choose among a number of data cleaning options and resolve ambiguities through visual context filtering.
The best of NLP
Enrich your data through world-class sentiment, emotion, entity, category, stance and subjectivity detection models. Find language-independent document and word embeddings using neural networks to capture contextual similarities in your data.
Efficient modeling
Train text classifiers through an iterative technique using a combination of supervised and unsupervised machine learning. Save and run your models on unannotated data.
Flexible data structure
Leverage Dcipher’s flexible, nested data structure, which incorporates the output of operations. Use data on different levels of the hierarchy without having to transform it or keep track of relations between datasets.
Trend & burst analysis
Detect trends and sudden bursts in your data. Identify topics with high momentum and visualize their evolution over time. Forecast future trends using the latest machine learning techniques.
Automation & deployment
Build automated workflows and apply your models and operation pipelines on data in streams or batches. Access through Dcipher APIs for use in external applications and dashboards.
Features
Data import and export
- Import data in various flat and nested formats, including JSON, CSV, TSV, Excel, Word, RSS, PDF and plain text
- Import data from social media channels, news websites, and survey collectors
- Download data as a file in various formats or download visualizations
- Export data to APIs, RSS feeds, Miro or dashboards
Text wrangling and cleaning
- Sample and shuffle data
- Join datasets, both flat and nested
- Strip emojis, URLs, line breaks, XML tags, tabs, punctuation, and any user-specified prefix or substring from text
- Fix spelling errors
- Standardize date formats
- Remove or extract duplicates or near-duplicates
- Segment texts into shorter, cohesive text snippets
- Split texts by pattern
- Extract patterns and substrings from text
- Extract dates from text
- Filter on the same level or across levels of nested data structure
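To give a feel for what two of the cleaning steps above involve, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the Studio's own implementation: the function names clean_text and drop_near_duplicates, the regular expressions, and the 0.9 similarity threshold are all hypothetical choices. Emoji and punctuation removal are left out to keep the example short.

```python
import re
from difflib import SequenceMatcher

# Illustrative patterns (assumptions, not the product's actual rules).
URL_RE = re.compile(r"https?://\S+")
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Remove URLs and XML/HTML tags, then collapse line breaks and tabs."""
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def drop_near_duplicates(texts, threshold=0.9):
    """Keep a text only if it is not too similar to any text already kept.

    Similarity is difflib's character-based ratio in [0, 1]. The pairwise
    comparison is quadratic, so this is only suitable for small collections.
    """
    kept = []
    for text in texts:
        is_duplicate = any(
            SequenceMatcher(None, text, other).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(text)
    return kept
```

For example, clean_text("Read <b>this</b>\n\tat https://example.com now") yields "Read this at now". Large collections would need a scalable near-duplicate technique in place of the pairwise comparison shown here.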
Natural language processing and understanding
- Tokenize, lemmatize and remove stop words from text
- Tag words with their parts-of-speech
- Detect the language of texts and automatically translate them into another language
- Find topics and overrepresented words in a set of documents
- Enrich texts by automatically annotating them with categories, entities, sentiment, emotions, subjectivity and stance
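As a rough illustration of tokenization, stop-word removal and finding overrepresented words, the Python sketch below compares smoothed word frequencies in a target set of documents against a background set. Everything in it is an assumption made for the example: the tiny STOP_WORDS set, the tokenize and overrepresented helper names, and the add-one-smoothed frequency ratio used as the score. It does not lemmatize, and it is not a description of the models the Studio uses.

```python
import re
from collections import Counter

# A deliberately tiny, illustrative stop-word list (assumption).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "on"}


def tokenize(text):
    """Lowercase, split into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def overrepresented(target_docs, background_docs, top_n=3):
    """Rank words by how much more frequent they are in the target set.

    The score is the ratio of add-one-smoothed relative frequencies.
    """
    target = Counter(t for d in target_docs for t in tokenize(d))
    background = Counter(t for d in background_docs for t in tokenize(d))
    target_total = sum(target.values())
    background_total = sum(background.values())
    vocab_size = len(set(target) | set(background))

    scores = {}
    for word, count in target.items():
        p_target = (count + 1) / (target_total + vocab_size)
        p_background = (background[word] + 1) / (background_total + vocab_size)
        scores[word] = p_target / p_background
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

With target documents about battery complaints and a background of unrelated reviews, the word "battery" would rank first because it is frequent in the target set and absent from the background.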
Natural language generation
- Instruction-based question answering
- Generative text classification
- Conditional text generation for topic summarization
Quantification and analysis
- Quantify the occurrence of words
- Quantify the length of texts in terms of the number of characters, words, sentences, and paragraphs
- Group, aggregate, and run complex functions on data
- Tag and annotate individual documents or groups of documents
- Detect links between values and visualize them as a network
- Run time-series momentum and burst analysis on topics and documents
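One simple way to think about burst analysis is to flag time periods where a count jumps well above its recent baseline. The Python sketch below does this with a rolling mean and standard deviation. It is a simplified stand-in chosen for illustration, not the Studio's algorithm, and the function name detect_bursts, the window size, the multiplier k and the minimum deviation of 1.0 are all assumed values.

```python
from statistics import mean, pstdev


def detect_bursts(counts, window=5, k=3.0):
    """Return indices where a count exceeds the rolling baseline.

    For each position, the baseline is the mean of the previous `window`
    counts, and the threshold is baseline + k * standard deviation.
    The deviation is floored at 1.0 so that a perfectly flat history
    does not turn every small change into a burst.
    """
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        baseline = mean(history)
        spread = max(pstdev(history), 1.0)
        if counts[i] > baseline + k * spread:
            bursts.append(i)
    return bursts


# Daily mention counts for one topic (made-up numbers for illustration).
daily_mentions = [4, 5, 4, 6, 5, 5, 30, 6, 5]
burst_days = detect_bursts(daily_mentions)
```

In this made-up series only the day with 30 mentions (index 6) is flagged. The first `window` positions are never evaluated because they have no full history.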
Search and mapping
- Search for words and texts based on search criteria or contextual similarity
- Cluster documents based on semantic similarity into a document landscape
- State-of-the-art ML models for vectorization, dimensionality reduction, outlier detection and clustering of documents and words
- State-of-the-art ML models for supervised, semi-supervised or zero-shot classification of documents
- 10+ workbenches to visually inspect and draw conclusions from data
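To illustrate the idea of similarity-based search, the Python sketch below ranks documents by cosine similarity to a query. Note the substitution: it uses plain word-count vectors rather than the neural embeddings described above, so it only captures shared vocabulary, not contextual meaning. The helper names vectorize, cosine and search are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Represent a text as a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, top_n=1):
    """Return the top_n documents most similar to the query."""
    query_vec = vectorize(query)
    ranked = sorted(
        docs,
        key=lambda doc: cosine(query_vec, vectorize(doc)),
        reverse=True,
    )
    return ranked[:top_n]
```

Searching for "short battery life" in a small set of reviews would return the review that shares the most words with the query. Swapping vectorize for an embedding model would let the same ranking logic match texts that are similar in meaning but use different words.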