
What is Lilac
Lilac Garden is a tool for data exploration and quality control, aimed in particular at datasets used with large language models (LLMs). It offers clustering, semantic and keyword search, field editing and comparison, and detection of PII, duplicates, and custom signals. Because it processes datasets at high speed, Lilac Garden shortens data transformations and helps users understand and refine their datasets.
How to Use Lilac
- Install the Lilac Python package.
- Use the user interface to explore and edit your datasets.
- Perform clustering, semantic search, and other data transformations to refine your data.
Features of Lilac
- Clustering: Cluster and title 1 million data points in 20 minutes.
- Semantic & Keyword Search: Perform advanced searches to find relevant data points.
- Edit & Compare Fields: Easily edit and compare different fields within your dataset.
- PII, Duplicates, Language Detection: Detect personally identifiable information, duplicates, and language within your dataset.
- Fuzzy-Concept Search with Refinement: Refine your search with fuzzy-concept matching to find the most relevant data.
- Blazing Fast Dataset Computations: Embed your dataset at half a billion tokens per minute.
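
To make a few of the features above more concrete, here is a minimal Python sketch of what keyword search, PII detection, and duplicate detection can look like on a tiny text dataset. It uses only the standard library and is not Lilac's API or implementation: the sample rows, the function names (`keyword_search`, `find_emails`, `find_duplicates`), and the simplified email pattern are all assumptions made for illustration.

```python
import hashlib
import re

# Hypothetical toy dataset, invented for this sketch (not from Lilac).
ROWS = [
    "Contact me at jane.doe@example.com for details.",
    "The quick brown fox jumps over the lazy dog.",
    "the quick brown fox   jumps over the lazy dog.",
    "Lilac blooms in late spring.",
]

# Simplified email pattern (assumption); real PII detection covers far more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def keyword_search(rows, keyword):
    """Return indices of rows that contain the keyword, ignoring case."""
    needle = keyword.lower()
    return [i for i, text in enumerate(rows) if needle in text.lower()]


def find_emails(rows):
    """Map row index to the email-like strings found in that row."""
    hits = {}
    for i, text in enumerate(rows):
        found = EMAIL_RE.findall(text)
        if found:
            hits[i] = found
    return hits


def _fingerprint(text):
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(rows):
    """Group row indices whose normalized text is identical."""
    groups = {}
    for i, text in enumerate(rows):
        groups.setdefault(_fingerprint(text), []).append(i)
    return [indices for indices in groups.values() if len(indices) > 1]


if __name__ == "__main__":
    print("keyword hits:", keyword_search(ROWS, "fox"))
    print("email hits:", find_emails(ROWS))
    print("duplicate groups:", find_duplicates(ROWS))
```

This sketch only catches duplicates that match exactly after normalization and only one kind of PII. Semantic search, clustering, and fuzzy-concept matching rely on embeddings and are outside what it shows.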