
Sectors & YAML

Every Queria company belongs to a sector (sector) and optionally to a sub-sector (subSector). The configuration lives in versioned YAML files in the repository that describe the sector's semantic domain: which entities exist, what they are called in natural language, which measures to aggregate, and which access rules to apply.

As of v3.5.0 these YAMLs are much richer: they include tags, domain terms, dimensions with aliases, measures, hierarchies, and confidentiality policies.

Available sectors

Queria includes 26+ out-of-the-box sectors:

| Category | Sectors |
| --- | --- |
| Professional services | LEGAL, FINANCE, INSURANCE, BANKING, HR |
| Healthcare | HEALTHCARE, PHARMA, RESEARCH |
| Industry | MANUFACTURING, AUTOMOTIVE, ENERGY, ENGINEERING, CONSTRUCTION |
| Consumer goods | FOOD, ECOMMERCE |
| Transport | LOGISTICS, TELECOM |
| Public | GOVERNMENT, EDUCATION, NONPROFIT |
| Services | HOSPITALITY, TOURISM, SUPPORT, MEDIA, REAL_ESTATE, AGRICULTURE |
| Generic | CONTRACT, TECHNICAL, GENERIC, CUSTOM |

Each sector has a YAML file at backend/src/config/sectors/<sector>.sector.yml, which the sector-loader loads at startup.

Anatomy of a sector YAML

Example (excerpt from legal.sector.yml):

yaml
sector: LEGAL
name: "Legale"
description: "Contracts, judgments, regulations, legal acts"

typicalConcepts:
  - tipo_documento
  - parti
  - giurisdizione
  - data
  - materia
  - articolo
  - clausola

typicalQueries:
  - "Find contracts with company X"
  - "Which judgments relate to condominiums?"
  - "What does article 5 of the contract say?"
  - "Privacy regulation"

relevantMetadataFields:
  - document_type
  - parties
  - jurisdiction
  - date
  - matter
  - court

externalSources:
  - legal-sources

domainTerms:
  - contratto
  - sentenza
  - clausola
  - articolo
  - giurisdizione
  - parte
  - attore
  - convenuto

dimensions:
  - name: document_type
    type: string
    aliases: ["tipo documento", "tipo atto", "tipologia"]
    hierarchy: [category, specific_type]
  - name: parties
    type: string[]
    aliases: ["parti", "soggetti", "attore", "convenuto"]
  - name: date
    type: date
    aliases: ["data", "anno", "periodo"]

measures:
  - name: count
    type: count
    description: "Element count"
  - name: distribution
    type: group_by
    description: "Distribution by dimension"

accessPolicy:
  topicIsolation: true
  confidentialityRules:
    - role: READER
      exclude: ["CONFIDENTIAL", "RESTRICTED"]
    - role: EDITOR
      exclude: ["RESTRICTED"]
    - role: ADMIN
      exclude: []

yamlVersion: 1

Sector fields

Identity

| Field | Type | Description |
| --- | --- | --- |
| sector | enum | Unique code (uppercase, e.g. LEGAL) |
| name | string | Display name (e.g. "Legale") |
| description | string | Short description shown in the company UI |

Natural language

| Field | Type | Use |
| --- | --- | --- |
| typicalConcepts[] | string[] | Concepts that recur in questions, used by the planner |
| typicalQueries[] | string[] | Typical questions, used as classifier prompt |
| domainTerms[] | string[] | Sector vocabulary: boosts chunks containing them and improves metadata extraction |

Tags and domain terms boosted in v3.5.0

domainTerms are now also stamped on Qdrant chunks as a top-level payload field (domainTerms) via the sector-fields backfill. This speeds up retrieval filters by about 5x and improves the relevance booster.

Dimensions and measures

dimensions[] -- semantic attributes worth filtering, grouping, or comparing on:

  • name: canonical name (e.g. parties)
  • type: string | string[] | number | date
  • aliases[]: natural-language synonyms (e.g. ["parti", "soggetti", "attore", "convenuto"])
  • hierarchy[]: optional hierarchies (e.g. [category, specific_type] for drill-down)
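The alias lookup implied by aliases[] can be sketched as follows. This is only an illustration, not Queria's planner code: the Dimension interface and the normalize, buildAliasIndex, and resolveDimension helpers are hypothetical names, and the matching rule (case-insensitive, whitespace and underscores treated alike) is an assumption.

```typescript
// Hypothetical shape mirroring one entry of `dimensions[]` in a sector YAML.
interface Dimension {
  name: string;
  type: "string" | "string[]" | "number" | "date";
  aliases?: string[];
  hierarchy?: string[];
}

// Lowercase and collapse whitespace/underscores so "Tipo  Documento"
// and "tipo_documento" both become "tipo documento".
function normalize(text: string): string {
  return text.toLowerCase().replace(/[\s_]+/g, " ").trim();
}

// Map every canonical name and alias to the canonical dimension name.
// Canonical names are always registered; an alias is registered only
// if its key is still free (assumed tie-break rule).
function buildAliasIndex(dimensions: Dimension[]): Map<string, string> {
  const index = new Map<string, string>();
  for (const dim of dimensions) {
    index.set(normalize(dim.name), dim.name);
    for (const alias of dim.aliases ?? []) {
      const key = normalize(alias);
      if (!index.has(key)) index.set(key, dim.name);
    }
  }
  return index;
}

// Returns the canonical name, or undefined when the phrase is unknown.
function resolveDimension(phrase: string, index: Map<string, string>): string | undefined {
  return index.get(normalize(phrase));
}

// Dimensions copied from the legal.sector.yml excerpt.
const legalDimensions: Dimension[] = [
  {
    name: "document_type",
    type: "string",
    aliases: ["tipo documento", "tipo atto", "tipologia"],
    hierarchy: ["category", "specific_type"],
  },
  { name: "parties", type: "string[]", aliases: ["parti", "soggetti", "attore", "convenuto"] },
  { name: "date", type: "date", aliases: ["data", "anno", "periodo"] },
];

const aliasIndex = buildAliasIndex(legalDimensions);
```

With this index, a phrase such as "Tipo Atto" resolves to document_type and "convenuto" resolves to parties, while an unknown phrase yields undefined so the caller can fall back to plain retrieval.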

measures[] -- applicable aggregations:

  • type: count | sum | avg | min | max | group_by

The planner LLM uses dimensions and measures to build structured queries when the user asks for aggregations ("how many contracts in 2025?", "average amount per category?").
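To make the measure types concrete, here is a minimal sketch of how a structured query could be evaluated over already-retrieved rows. It assumes a simplified query shape (StructuredQuery) and in-memory rows; the real planner output and execution engine are not described in this page, so every name below is hypothetical.

```typescript
type MeasureType = "count" | "sum" | "avg" | "min" | "max" | "group_by";

// Hypothetical, simplified planner output.
interface StructuredQuery {
  measure: MeasureType;
  // Numeric field for sum/avg/min/max, grouping dimension for group_by.
  field?: string;
}

type Row = { [field: string]: string | number };

// Keep only the numeric values found under `field`.
function numericValues(rows: Row[], field: string): number[] {
  return rows.map((row) => row[field]).filter((v): v is number => typeof v === "number");
}

// Returns a number for scalar measures, a Map of counts for group_by,
// and null when a numeric measure has no values to aggregate.
function aggregate(rows: Row[], query: StructuredQuery): number | Map<string, number> | null {
  const { measure, field } = query;
  if (measure === "count") return rows.length;
  if (field === undefined) throw new Error(`measure ${measure} needs a field`);

  if (measure === "group_by") {
    const counts = new Map<string, number>();
    for (const row of rows) {
      const key = String(row[field]);
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
    return counts;
  }

  const values = numericValues(rows, field);
  if (values.length === 0) return null;
  const total = values.reduce((a, b) => a + b, 0);
  switch (measure) {
    case "sum":
      return total;
    case "avg":
      return total / values.length;
    case "min":
      return Math.min(...values);
    default:
      return Math.max(...values); // the only remaining measure is "max"
  }
}
```

A question like "how many contracts in 2025?" would map to a count measure applied after a date filter. Filtering and combined requests such as "average amount per category" are left out of this sketch to keep it short.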

External sources

externalSources[] -- lists external microservices relevant for the sector. Examples:

  • LEGAL -> legal-sources (Normattiva)
  • FOOD -> food-sources (Open Food Facts)
  • PHARMA -> pharma-sources (FDA, ClinicalTrials)
  • TOURISM -> consumes open-data via AI Constructor

Access policy

Role-based confidentiality policies:

yaml
accessPolicy:
  topicIsolation: true
  confidentialityRules:
    - role: READER
      exclude: ["CONFIDENTIAL", "RESTRICTED"]

Meaning: a user with the READER role cannot see chunks whose confidentiality is CONFIDENTIAL or RESTRICTED. Policies are applied as Qdrant filters at runtime, independently of topics.
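As an illustration of how such a rule might translate into a Qdrant filter, the sketch below builds a must_not condition with a match.any clause. The payload key confidentiality, the confidentialityFilter helper, and the choice to reject roles that have no rule are assumptions, not documented Queria behavior.

```typescript
interface ConfidentialityRule {
  role: string;
  exclude: string[];
}

interface AccessPolicy {
  topicIsolation: boolean;
  confidentialityRules: ConfidentialityRule[];
}

// Only the subset of Qdrant's filter JSON needed for this sketch.
interface QdrantFilter {
  must_not: { key: string; match: { any: string[] } }[];
}

// Build the filter that hides excluded confidentiality levels for a role.
// Assumption: the level is stored under the payload key "confidentiality".
function confidentialityFilter(policy: AccessPolicy, role: string): QdrantFilter {
  const rule = policy.confidentialityRules.find((r) => r.role === role);
  if (!rule) {
    // Assumed fail-closed behavior: a role without a rule gets no access.
    throw new Error(`no confidentiality rule for role ${role}`);
  }
  if (rule.exclude.length === 0) return { must_not: [] };
  return { must_not: [{ key: "confidentiality", match: { any: rule.exclude } }] };
}

// Policy copied from the legal.sector.yml excerpt.
const legalPolicy: AccessPolicy = {
  topicIsolation: true,
  confidentialityRules: [
    { role: "READER", exclude: ["CONFIDENTIAL", "RESTRICTED"] },
    { role: "EDITOR", exclude: ["RESTRICTED"] },
    { role: "ADMIN", exclude: [] },
  ],
};

const readerFilter = confidentialityFilter(legalPolicy, "READER");
```

The resulting object would be combined with any topic filters before being sent to Qdrant; how Queria merges the two is not covered here.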

Sub-sectors

A sector can have sub-sectors (subSectors) that inherit the base and apply deltas:

yaml
subSectors:
  contractDrafting:
    description: "Contract drafting and review"
    dimensions:
      - name: clause_category
        type: string
        aliases: ["clause category", "clause type"]

Sub-sectors are useful for companies that share the base sector but have different specializations: for example, a "tax" law firm and a "labor" law firm both use the LEGAL sector but different sub-sectors with finer-grained dimensions.

In the database, every sub-sector is a SubSector row, and a company is assigned to one via Company.subSectorId. Without this assignment, sector routing does not kick in and Qdrant chunks are not enriched with sector fields.
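The "inherit the base and apply deltas" idea can be sketched like this. The exact merge semantics are not specified in this page, so the rule used here (a sub-sector dimension replaces a base dimension with the same name, otherwise it is appended) is an assumption, as are the SectorBase, SubSectorDelta, and applySubSector names.

```typescript
interface SectorDimension {
  name: string;
  type: string;
  aliases?: string[];
}

interface SectorBase {
  sector: string;
  description: string;
  dimensions: SectorDimension[];
}

// A sub-sector only lists what differs from the base.
interface SubSectorDelta {
  description?: string;
  dimensions?: SectorDimension[];
}

// Return the effective configuration without mutating the base.
function applySubSector(base: SectorBase, delta: SubSectorDelta): SectorBase {
  const byName = new Map<string, SectorDimension>();
  for (const dim of base.dimensions) byName.set(dim.name, dim);
  // Assumed rule: same name overrides, new name is appended.
  for (const dim of delta.dimensions ?? []) byName.set(dim.name, dim);
  return {
    ...base,
    description: delta.description ?? base.description,
    dimensions: Array.from(byName.values()),
  };
}

const legalBase: SectorBase = {
  sector: "LEGAL",
  description: "Contracts, judgments, regulations, legal acts",
  dimensions: [
    { name: "document_type", type: "string", aliases: ["tipo documento", "tipo atto", "tipologia"] },
    { name: "parties", type: "string[]", aliases: ["parti", "soggetti", "attore", "convenuto"] },
  ],
};

// Delta copied from the contractDrafting example.
const contractDrafting: SubSectorDelta = {
  description: "Contract drafting and review",
  dimensions: [{ name: "clause_category", type: "string", aliases: ["clause category", "clause type"] }],
};

const effectiveConfig = applySubSector(legalBase, contractDrafting);
```

Here effectiveConfig keeps the two base dimensions and gains clause_category, while legalBase itself is left untouched.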

Assigning a sector to a company

From the admin panel:

  1. Companies > [company] > Sector -> select the sector from the dropdown.
  2. Optional: choose a sub-sector.
  3. Save. From now on, newly ingested documents will be stamped with sector fields.

For pre-v3.5.0 companies:

sql
UPDATE "Company"
SET "subSectorId" = '<id-from-SubSector>'
WHERE id = '<companyId>';

Then run backfills to enrich existing chunks (see Ingestion DSL > Reingest).

Editing a YAML

*.sector.yml files are repo-versioned. To edit them:

  1. Open a PR on backend/src/config/sectors/<sector>.sector.yml.
  2. The schema is validated by Zod (_schema.ts); the CI build fails if the file is invalid.
  3. Bump yamlVersion by 1.
  4. Deploy: at boot, sector-loader reloads YAMLs, sector-seeder updates Sector and SubSector in DB.
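The "bump yamlVersion by 1" step lends itself to an automated check. The helper below is a hypothetical pre-merge check, not part of Queria and not a replacement for the Zod schema in _schema.ts. It assumes the previous and the edited YAML have already been parsed into objects.

```typescript
// Only the fields this check needs; the real schema has many more.
interface VersionedSector {
  sector: string;
  yamlVersion: number;
}

// Compare the file before and after the edit and list any problems found.
// An empty array means the bump looks correct.
function checkVersionBump(previous: VersionedSector, next: VersionedSector): string[] {
  const problems: string[] = [];
  if (previous.sector !== next.sector) {
    problems.push(`sector code changed from ${previous.sector} to ${next.sector}`);
  }
  if (!Number.isInteger(next.yamlVersion)) {
    problems.push("yamlVersion must be an integer");
  } else if (next.yamlVersion !== previous.yamlVersion + 1) {
    problems.push(
      `yamlVersion should be ${previous.yamlVersion + 1}, found ${next.yamlVersion}`
    );
  }
  return problems;
}

const beforeEdit: VersionedSector = { sector: "LEGAL", yamlVersion: 1 };
const afterEdit: VersionedSector = { sector: "LEGAL", yamlVersion: 2 };
const bumpProblems = checkVersionBump(beforeEdit, afterEdit);
```

A CI job could fail the build whenever the returned list is non-empty, which catches a forgotten or doubled bump before deploy.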

Backfill after edit

If you add/rename domainTerms, consider rerunning the sector-fields backfill on the sector's companies, otherwise existing chunks won't have the new terms in payload.
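One way to decide whether a backfill is worth running is to diff the domainTerms lists before and after the edit. The diffDomainTerms helper is hypothetical. It treats a rename as one removal plus one addition and flags a backfill whenever at least one term was added; what to do after a removal-only change is left to the operator.

```typescript
interface DomainTermsDiff {
  added: string[];
  removed: string[];
  needsBackfill: boolean;
}

// Compare two domainTerms lists, ignoring order and duplicates.
function diffDomainTerms(before: string[], after: string[]): DomainTermsDiff {
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  const added = Array.from(afterSet).filter((term) => !beforeSet.has(term));
  const removed = Array.from(beforeSet).filter((term) => !afterSet.has(term));
  // New terms are missing from existing chunk payloads until a backfill runs.
  return { added, removed, needsBackfill: added.length > 0 };
}

const termsDiff = diffDomainTerms(
  ["contratto", "sentenza", "clausola"],
  ["contratto", "clausola", "ordinanza"]
);
```

In this example "sentenza" was replaced by "ordinanza" (an illustrative term, not taken from the bundled YAML), so the diff reports one addition, one removal, and needsBackfill set to true.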

Custom sector per tenant

For tenants that don't fit the 26 bundled sectors:

  • Use sector: CUSTOM with an internal YAML.
  • Configure dimensions/measures/domain terms specific to the customer domain.
  • Documents are still indexed and searched normally.

UI-based custom sector creation is on the V2 roadmap (no more YAML PRs).

Company tags (boosted in v3.5.0)

Regardless of sector, every company has a tags[] field that works as free labels:

  • Visible in the company sheet.
  • Filterable in cross-tenant searches (SYSTEM_ADMIN only).
  • Used by the planner to disambiguate behavior when sector alone isn't enough (e.g. "company tagged enterprise" -> use enterprise bot template).

Add tags in the Company > Tags sheet, pressing Enter or Tab after each one.

Company domain terms

Beyond the sector domainTerms, every company can add company-specific domain terms from its sheet:

  • Internal acronyms (e.g. MOL, EBITDA-adj).
  • Customer product names.
  • Company terminology glossary.

These are merged with the sector ones when evaluating chunk relevance, giving an explicit boost to the company lexicon.
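A rough sketch of the merge and boost follows. The actual scoring formula is not documented here, so everything in it is an assumption: the case-insensitive de-duplication, the naive substring match, and a boost of the form 1 + boostPerTerm * (number of distinct terms found). The mergeTerms and relevanceBoost names and the default boostPerTerm are hypothetical.

```typescript
// Merge sector and company terms, de-duplicating case-insensitively.
function mergeTerms(sectorTerms: string[], companyTerms: string[]): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const term of [...sectorTerms, ...companyTerms]) {
    const key = term.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      merged.push(key);
    }
  }
  return merged;
}

// Multiplier applied to a chunk's base score.
// Assumption: each distinct term found adds `boostPerTerm`; the match is a
// plain substring test, so "parte" also matches "partecipazione".
function relevanceBoost(chunkText: string, terms: string[], boostPerTerm = 0.05): number {
  const haystack = chunkText.toLowerCase();
  const hits = terms.filter((term) => haystack.includes(term.toLowerCase())).length;
  return 1 + boostPerTerm * hits;
}

const mergedTerms = mergeTerms(["contratto", "clausola"], ["EBITDA-adj", "Contratto"]);
const boost = relevanceBoost("Il contratto prevede una clausola penale", mergedTerms, 0.1);
```

A chunk that mentions none of the terms keeps a multiplier of 1, so the boost never penalizes ordinary text.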


Queria v3.5.0 -- Boosted Sector YAMLs

Queria - Document Intelligence with Cog-RAG