Note: This page is written by Wikidata.org.
Proper Etiquette for Using Wikidata in Exploratory Systems
Wikidata is an open, community-maintained knowledge base, and it is designed to be queried programmatically. That said, systems that rely on it—especially exploratory or agent-driven systems—should follow a few basic norms to remain reliable, respectful, and sustainable.
1. Always Identify Your Client
Wikidata expects automated clients to identify themselves via a meaningful User-Agent. Anonymous or generic agents are often rate-limited or blocked.
Best practice:
- Include a project name
- Include a project URL
- Include a contact email or page
Example:
User-Agent: MapRockExplorerSubgraph/0.1 (https://github.com/MapRock/IntelligenceBusiness; contact: you@example.com)
This isn’t just etiquette—it’s how Wikidata operators distinguish responsible automation from abuse.
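As a minimal sketch of the point above, the following Python builds a search request that carries the example User-Agent. It assumes the standard MediaWiki endpoint at `https://www.wikidata.org/w/api.php` and the `wbsearchentities` action. The helper name `build_search_request` is hypothetical, and the request is only constructed, not sent.

```python
import urllib.parse
import urllib.request

# Project details copied from the example above; replace with your own.
USER_AGENT = (
    "MapRockExplorerSubgraph/0.1 "
    "(https://github.com/MapRock/IntelligenceBusiness; contact: you@example.com)"
)

API_URL = "https://www.wikidata.org/w/api.php"


def build_search_request(label: str, language: str = "en") -> urllib.request.Request:
    """Build (but do not send) a label search request with an identifying User-Agent."""
    params = urllib.parse.urlencode({
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "format": "json",
    })
    return urllib.request.Request(
        f"{API_URL}?{params}",
        headers={"User-Agent": USER_AGENT},
    )


request = build_search_request("corn")
# Pass `request` to urllib.request.urlopen() only when a network call is intended.
```

Setting the header in one place, rather than per call site, makes it harder for an unidentified request to slip out.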
2. Cache Aggressively
Wikidata is not meant to be your runtime lookup engine.
If you resolve a label to a QID once, you should:
- Cache it locally (in memory)
- Persist it (file, database, graph node)
- Reuse it across runs
In an Exploration Subgraph, identity resolution is a one-time cost, not something to repeat on every traversal.
If your system keeps asking Wikidata the same question, it’s a design smell.
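One way to sketch the cache-then-persist pattern is a small label-to-QID cache backed by a JSON file. Everything here is illustrative: the class name `QidCache`, the file name, and the stand-in resolver are assumptions, and the resolver takes the place of whatever function actually queries Wikidata.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, Optional


class QidCache:
    """In-memory label -> QID cache that is persisted to a JSON file."""

    def __init__(self, path: Path, resolver: Callable[[str], Optional[str]]):
        self.path = Path(path)
        self.resolver = resolver
        self.entries: Dict[str, Optional[str]] = {}
        # Reuse results from earlier runs if the cache file already exists.
        if self.path.exists():
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))

    def resolve(self, label: str) -> Optional[str]:
        key = label.strip().lower()
        if key not in self.entries:
            # Only call the resolver on a miss, then persist immediately.
            self.entries[key] = self.resolver(key)
            self.path.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
        return self.entries[key]


# Demo with a stand-in resolver that records how often it is called.
lookups = []


def stand_in_resolver(label: str) -> Optional[str]:
    lookups.append(label)
    return {"corn": "Q23148"}.get(label)


cache_file = Path(tempfile.mkdtemp()) / "qid_cache.json"
first_run = QidCache(cache_file, stand_in_resolver)
first_run.resolve("corn")
first_run.resolve("Corn ")   # normalized to the same key, so no second lookup

second_run = QidCache(cache_file, stand_in_resolver)
second_run.resolve("corn")   # served from the persisted file
```

Note that this sketch also caches misses (stored as `null`), so an unresolved label is not retried on every traversal. A real system may want to expire those entries so resolution can be attempted again later.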
3. Expect Partial or Ambiguous Results
Wikidata is broad, not perfect.
- Some labels resolve cleanly (e.g., corn → Q23148)
- Others may be ambiguous, underspecified, or absent
- Search results may reflect dominant cultural or economic usage
Your system should treat Wikidata resolution as grounding, not truth.
In the Exploration Subgraph:
- Missing QIDs are acceptable
- Provisional IRIs (project-local) are acceptable
- Resolution can happen later
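The fallback described above might look like the following sketch: a node gets a Wikidata entity IRI when a QID is known and a project-local provisional IRI otherwise, and can be promoted later. The namespace `https://example.org/es/provisional/` and the names `Grounding`, `ground`, and `promote` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote

WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"
# Hypothetical project-local namespace for nodes without a QID yet.
LOCAL_NS = "https://example.org/es/provisional/"


@dataclass(frozen=True)
class Grounding:
    label: str
    iri: str
    qid: Optional[str] = None

    @property
    def is_provisional(self) -> bool:
        return self.qid is None


def ground(label: str, qid: Optional[str] = None) -> Grounding:
    """Use the Wikidata IRI when a QID is known, else mint a provisional one."""
    if qid:
        return Grounding(label, WIKIDATA_ENTITY + qid, qid)
    slug = quote(label.strip().lower().replace(" ", "-"), safe="-")
    return Grounding(label, LOCAL_NS + slug)


def promote(node: Grounding, qid: str) -> Grounding:
    """Replace a provisional IRI once resolution succeeds later."""
    return Grounding(node.label, WIKIDATA_ENTITY + qid, qid)


resolved = ground("corn", "Q23148")
pending = ground("Feed Additive")   # no QID yet; still a usable node
```

Keeping the provisional flag explicit lets downstream code tell grounded nodes from placeholders without parsing IRIs.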
4. Rate and Batch Thoughtfully
Wikidata’s APIs are shared infrastructure.
Best practices:
- Avoid tight loops with per-item calls
- Add small delays between requests when doing bulk resolution
- Prefer batched API calls or a single SPARQL query over many per-item requests
For Exploration Subgraph (ES) construction:
- Ground after hypothesis generation
- Ground only what survives pruning
- Don’t ground speculative branches you won’t explore further
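A rough sketch of batching with a delay, assuming the `wbgetentities` action and its pipe-separated `ids` parameter. The batch size of 50 and the one-second delay are assumptions to check against the current API documentation, and `fetch` is a hypothetical callable supplied by the caller, so the sketch itself makes no network calls.

```python
import time
import urllib.parse
from typing import Callable, Iterable, Iterator, List

API_URL = "https://www.wikidata.org/w/api.php"
BATCH_SIZE = 50       # assumed per-request id limit for ordinary clients
DELAY_SECONDS = 1.0   # assumed polite pause between requests


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_entity_urls(qids: Iterable[str]) -> Iterator[str]:
    """Yield one wbgetentities URL per batch of de-duplicated QIDs."""
    unique = list(dict.fromkeys(qids))  # drop duplicates, keep order
    for batch in batched(unique, BATCH_SIZE):
        params = urllib.parse.urlencode({
            "action": "wbgetentities",
            "ids": "|".join(batch),
            "props": "labels",
            "languages": "en",
            "format": "json",
        })
        yield f"{API_URL}?{params}"


def fetch_all(qids: Iterable[str], fetch: Callable[[str], object],
              delay: float = DELAY_SECONDS) -> List[object]:
    """Call `fetch` once per batch URL, pausing between calls."""
    results = []
    for index, url in enumerate(build_entity_urls(qids)):
        if index:
            time.sleep(delay)
        results.append(fetch(url))
    return results
```

Under these assumptions, 120 surviving QIDs become three requests instead of 120, which is the main reason to ground only after pruning.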
5. Treat Wikidata as Reference, Not Authority
Wikidata already contains “has use,” “used for,” and “part of” relationships.
Those are descriptive facts, authored and curated.
The Exploration Subgraph, by contrast:
- Proposes hypotheses
- Records tentative roles
- Supports navigation, not classification
So:
- Link to Wikidata entities
- Do not copy Wikidata assertions wholesale
- Do not overload Wikidata predicates with exploratory meaning
Wikidata anchors what something is known to be.
The Exploration Subgraph explores what something might be involved in.
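The separation above can be illustrated with a small edge record that links to Wikidata entities but keeps its predicate in a project-local namespace. This is only a sketch: the namespace `https://example.org/es/`, the class `HypothesisEdge`, the role name `mightBeInputTo`, and the confidence value are all hypothetical.

```python
from dataclasses import dataclass

WIKIDATA = "http://www.wikidata.org/"
WD = WIKIDATA + "entity/"
ES = "https://example.org/es/"  # hypothetical project namespace


@dataclass(frozen=True)
class HypothesisEdge:
    subject: str       # IRI; may be a Wikidata entity or a provisional node
    predicate: str     # always a project-local IRI
    obj: str           # IRI; may be a Wikidata entity or a provisional node
    confidence: float
    status: str = "tentative"

    def __post_init__(self) -> None:
        # Exploratory meaning must not be attached to Wikidata's own predicates.
        if self.predicate.startswith(WIKIDATA):
            raise ValueError("use a project-local predicate for hypotheses")


def propose(subject: str, role: str, obj: str, confidence: float) -> HypothesisEdge:
    """Record a tentative role using a predicate from the project namespace."""
    return HypothesisEdge(subject, ES + "predicate/" + role, obj, confidence)


edge = propose(
    WD + "Q23148",
    "mightBeInputTo",
    ES + "provisional/ethanol-production",
    0.4,
)
```

The entity IRIs point at Wikidata, while the predicate and the tentative status stay in the project's own vocabulary, so nothing here can be mistaken for a curated Wikidata statement.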
6. Respect Community Boundaries
If your system ever:
- Writes back to Wikidata
- Proposes automated edits
- Uses Wikidata at very large scale
…then you must engage with the Wikidata community directly.
For read-only exploratory systems, the practices above are sufficient and expected.