🐝Language, terms and buzzwords

TLDR: Enterprise data management and data analysis have their own set of words that require understanding.

Clarista has been designed to enable both the data experts and the data consumers.

  • Data experts leverage Clarista to publish data from multiple sources, manage data for clarity and quality, and produce advanced analytics for business questions and needs. Examples of data experts include data analysts, data engineers, data management teams, data scientists and business intelligence experts.

  • Data consumers leverage Clarista to find, access, explore and analyze insights produced by data experts. Examples of data consumers include sales, marketing, operations, compliance, finance, risk and product teams.

Below you can find a list of some terminology you may come across in this documentation. The terms are ordered by popularity and not in alphabetical order.

  1. GenAI - is a field of artificial intelligence that is capable of generating text, images, videos, or other data using generative models, often in response to user prompts.

  2. LLM - stands for Large Language Model. LLMs are trained on millions to billions of documents and web articles, and GenAI relies on them to work.

  3. Natural Language Queries - represent business questions in natural spoken language, in contrast to Technical Queries, which are written as code by technical experts.

  4. TALKdata - is a capability of Clarista where users can get real-time answers to their business questions.

  5. Reinforcement Learning - is a specialized field of Artificial Intelligence in which the model learns from user or machine feedback. Clarista leverages this technique to refine its interpretation of users' business questions and to improve the accuracy of the answers.

  6. Trackers - are user-configured charts, metrics and data tables created using natural language questions.

  7. Data Fabric - is a platform-independent technical architecture that facilitates end-to-end automation of data processes, from data sourcing to data analysis. Such an architecture enables enterprises to achieve their data management and analytics objectives much faster and at a lower cost.

  8. Semantics Layer - is a business representation of technical data, organized for a business purpose, with data definitions. The Semantics Engine enables automated translation of technical data to business data at run time.

  9. Data as a Product (DaaP) - is the practice of making data available to the business in a form that is relevant, discoverable, self-describing, accessible, trustworthy and secured.

  10. Data Domains - are logical groupings (think folders or workspaces) under which different Data Products can be organized by business function (e.g., sales, finance, etc.).

  11. Data Pods - are business-ready data products organized within different Data Domains. Clarista Data Pods follow the principles of Data as a Product (DaaP) described above.

  12. Data Discovery - is the set of capabilities data consumers need to find role-relevant data, access it, understand it and get quick (and accurate) answers to their questions.

  13. Data Dictionary - provides business definitions and additional details, together called ‘meta-data’, for organizational data stored across multiple technical platforms and systems.

  14. Data Profile - quantifies the structure of data using statistical and engineering methods. Outputs of a data profile can include data classifications, data types, missing values, ranges of values, distributions of data values, etc.

  15. Data Exploration - provides Excel-like capabilities to analyze data. These capabilities include searching, filtering, sorting and querying data.

  16. Data Catalog - provides a mechanism for Data Discovery & Exploration, bringing multiple capabilities together such as search, data dictionary, data profile and interactive data exploration.

  17. Data Analysis - comprises multiple methods to analyze data. These methods include interactive reports, dashboards, natural language queries, data transformation, personalized alerts and advanced AI/ML insights.

  18. Data Visualization - provides the capabilities to visualize data through a rich library of charts, with options to analyze scenarios, apply filters and drill through the charts to the underlying data.

  19. Data Flows - provide the ability for data analysts and data engineers to transform raw data based on business needs, and for data scientists to apply proprietary AI/ML models to produce advanced insights.

  20. Data Alerts - provide a mechanism to monitor critical business KPIs (key performance indicators) and data quality.

  21. Scheduler - provides the flexibility to automate any Data Flows and Data Alerts.

  22. Administration - helps the SuperUser role configure data connectivity, data domains, user groups, users and roles for Clarista.

  23. Connections - provide the flexibility to connect to multiple cloud and on-premises data sources. Once configured, these connections are used to configure Data Pods within Data Domains without moving any data.

  24. User Groups - provide the flexibility to group users based on roles and responsibilities, and to entitle them to different Data Domains within Clarista.

  25. Data Usage - provides transparency on every data process and data query executed within Clarista by any user of the system.

  26. Data Lineage - tracks dependencies of any data or analytics configured within Clarista.
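
To make the Data Profile entry (item 14) more concrete, here is a minimal Python sketch of the kind of outputs a profile can contain for a single column: data type, missing values, distinct values and range of values. It is an illustration only and does not show how Clarista computes profiles; the function name `profile_column` and the sample data are hypothetical.

```python
from collections import Counter


def profile_column(values):
    """Return a small profile of one column; None marks a missing value.

    Hypothetical helper for illustration, not part of Clarista.
    """
    present = [v for v in values if v is not None]
    type_counts = Counter(type(v).__name__ for v in present)
    profile = {
        "count": len(values),
        "missing": len(values) - len(present),
        # Most common Python type among the non-missing values.
        "data_type": type_counts.most_common(1)[0][0] if present else None,
        "distinct": len(set(present)),
    }
    # A range of values is only reported when every present value is numeric.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if present and numeric:
        profile["min"] = min(present)
        profile["max"] = max(present)
    return profile


# Hypothetical sample column with one missing value.
order_amounts = [120.0, 75.5, None, 300.0, 75.5]
print(profile_column(order_amounts))
```

A fuller profile would typically add data classifications and value distributions, as listed in the definition.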
