About the platform
Company intelligence, straight from the source.
EDGAR Insights turns a stock ticker into a structured, source-linked regulatory & ESG profile of a public company — extracted live from U.S. SEC filings and analyzed with AI. Every data point carries a confidence level and links back to the exact filing it came from.
What it does
Enter a ticker and EDGAR Insights produces 12 standardized data points covering a company's corporate identity, scale, operating footprint, industry, products, and environmental / supply-chain risk profile. It reads the company's most recent annual report directly from the SEC, pulls structured financial facts, and uses AI to extract the answers that aren't available as tidy fields — then scores how confident it is in each one and cites the source.
How to use it
- Search a ticker. Start typing a ticker or company name. A live dropdown confirms the company — name, exchange, and SEC CIK — so you know you've got the right entity.
- Generate. Click Generate. The first time a company is requested, extraction takes roughly 20–40 seconds (downloading and analyzing the latest annual report). Profiles already in the database load instantly.
- Read the profile. A company header (identity, sector, filing links) sits above 12 cards. Each card shows the answer, a High / Medium / Low confidence pill, a verbatim evidence quote where the answer came from AI, and links to the underlying SEC source.
- Export. Use Download JSON or Copy JSON at the bottom of any profile to get the full machine-readable result — values, confidence, and sources.
- Regenerate. A cached profile shows a Regenerate button to force a fresh extraction from the latest filing.
The 12 data points
Each answer is tagged by how it's derived: SEC structured comes straight from SEC fields or XBRL financial facts; AI-analyzed is read from the filing text by the AI model and quoted as evidence.
- 1 [SEC structured] Country (and U.S. state) of incorporation: From the SEC submissions record.
- 2 [SEC structured] Stock-exchange listing(s): Exchanges and tickers from SEC data.
- 3 [SEC structured] Entity type: Public company; bank/insurer/fund tags inferred from SIC classification.
- 4 [SEC structured] Approximate global annual revenue (USD): Latest fiscal-year revenue from XBRL company facts.
- 5 [XBRL / AI] Global employee headcount: From XBRL where tagged, else the company-wide total in the filing text.
- 6 [AI-analyzed] Countries & U.S. states of operations: From the business description and Exhibit 21 (subsidiaries).
- 7 [AI-analyzed] Regions of significant revenue: Geographic revenue breakdown with shares where stated.
- 8 [SEC + AI] Primary industry sector (GICS): Mapped from SIC code and cross-checked by AI.
- 9 [AI-analyzed] Physical products & categories: Whether the company makes/sells physical goods, and which categories.
- 10 [AI-analyzed] Environmental / sustainability claims: Whether the filing makes sustainability claims.
- 11 [AI + sector] Scope 1+2 GHG / energy intensity: High / Moderate / Low, from disclosures and a sector rule.
- 12 [AI-analyzed] Operations / supply-chain risk factors: Resource extraction, deforestation commodities, hazardous substances, conflict minerals, labor risk, water use, and more.
Where the data comes from
All data comes from the U.S. Securities & Exchange Commission's EDGAR system — there are no other external data providers. Specifically:
- Company & ticker directory — to resolve a ticker to a company and its CIK.
- Submissions record — identity, exchanges, SIC code, addresses, and the index of recent filings.
- XBRL company facts — structured financials, used for revenue and (when tagged) employee count.
- The latest annual report — Form 10-K (U.S. companies) or 20-F / 40-F (foreign issuers). The full document text is analyzed.
- Exhibit 21 — the list of subsidiaries and their jurisdictions, used for the operations footprint.
- Form SD — a structural signal that a company files conflict-minerals disclosures.
Foreign filings are supported (e.g. 20-F), and a translation path is built in for future non-English / EU sources.
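
The first three sources above correspond to public SEC endpoints. As a rough sketch, the URLs might be built like this. The URL patterns are the SEC's published ones, but the function names are illustrative and not taken from EDGAR Insights itself. The SEC also asks automated clients to send a descriptive User-Agent header, which is omitted here because nothing is fetched.

```python
# Public SEC endpoints; helper names below are hypothetical.
TICKER_DIRECTORY_URL = "https://www.sec.gov/files/company_tickers.json"


def pad_cik(cik: int) -> str:
    """SEC data APIs expect the CIK zero-padded to 10 digits."""
    return f"{cik:010d}"


def submissions_url(cik: int) -> str:
    """Submissions record: identity, exchanges, SIC code, recent filings."""
    return f"https://data.sec.gov/submissions/CIK{pad_cik(cik)}.json"


def company_facts_url(cik: int) -> str:
    """XBRL company facts: structured financials such as revenue."""
    return f"https://data.sec.gov/api/xbrl/companyfacts/CIK{pad_cik(cik)}.json"


if __name__ == "__main__":
    apple_cik = 320193  # Apple Inc.
    print(submissions_url(apple_cik))
    print(company_facts_url(apple_cik))
```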
How it works
For each request the engine:
- resolves the ticker to a CIK and fetches the submissions record and XBRL facts in parallel;
- locates and downloads the most recent annual report and its Exhibit 21;
- derives the structured answers (incorporation, listings, entity type, revenue, sector) directly from SEC fields;
- runs targeted AI analyses over keyword-selected excerpts of the filing — workforce, operating footprint, revenue geography, products, and ESG / supply-chain risk — using Cloudflare Workers AI (Llama 3.3 70B);
- scores each answer's confidence and assembles the 12 results.
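
The keyword-selected excerpt step in the list above could be sketched as follows. This is a minimal illustration under assumptions, not the engine's actual code: the function name, the window size, and the example keywords are all hypothetical.

```python
import re


def select_excerpts(text: str, keywords: list[str], window: int = 200,
                    max_excerpts: int = 5) -> list[str]:
    """Return snippets of `text` surrounding case-insensitive keyword hits.

    Each hit is padded by `window` characters on both sides; overlapping
    snippets are merged so the model does not see the same passage twice.
    """
    if not keywords:
        return []
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    spans: list[tuple[int, int]] = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        if spans and start <= spans[-1][1]:
            # Overlaps the previous snippet: extend it instead of adding a new one.
            spans[-1] = (spans[-1][0], end)
        else:
            if len(spans) == max_excerpts:
                break
            spans.append((start, end))
    return [text[s:e] for s, e in spans]


if __name__ == "__main__":
    filing = "x" * 300 + "We had 12,000 employees worldwide." + "y" * 300
    for excerpt in select_excerpts(filing, ["employees", "headcount"], window=20):
        print(excerpt)
```

Sending only the matching excerpts keeps each AI prompt small, at the cost of missing answers phrased without any of the chosen keywords.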
Every result is saved to a private database so repeat requests are instant and can be validated over time (see the admin section).
Confidence scoring
Every data point is rated so you know how much to trust it — and can always verify via the linked source.
| Level | What it means |
|---|---|
| High | A structured SEC field, an XBRL financial fact, or two independent methods that agree. |
| Medium | Read from the filing text by AI with a verbatim supporting quote, or parsed by a targeted text scan. |
| Low | An inference without direct evidence (e.g. sector-based), or not determinable from the filing. |
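
The table above can be read as a small decision rule. The sketch below encodes it directly; the `Evidence` fields and the function name are hypothetical, and the real scoring may weigh more signals than these four.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical summary of how an answer was obtained."""
    from_structured_source: bool = False  # SEC field or XBRL financial fact
    methods_agree: bool = False           # two independent methods gave the same answer
    has_verbatim_quote: bool = False      # AI answer backed by a quote from the filing
    from_text_scan: bool = False          # parsed by a targeted text scan


def score_confidence(evidence: Evidence) -> str:
    """Map evidence to High / Medium / Low following the table's rules."""
    if evidence.from_structured_source or evidence.methods_agree:
        return "High"
    if evidence.has_verbatim_quote or evidence.from_text_scan:
        return "Medium"
    # Inference without direct evidence, or not determinable from the filing.
    return "Low"


if __name__ == "__main__":
    print(score_confidence(Evidence(from_structured_source=True)))
    print(score_confidence(Evidence(has_verbatim_quote=True)))
    print(score_confidence(Evidence()))
```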
Exports
Any profile can be exported as JSON — the complete machine-readable result, including every value, its confidence level and basis, and the list of sources. Use Download JSON to save a file or Copy JSON to copy it to your clipboard. (Administrators can also export many companies at once as CSV — see below.)
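
Once exported, the JSON can be filtered with a few lines of code, for example to list the data points that deserve a manual check. The export schema is not documented here, so the field names below (`data_points`, `label`, `confidence`) and the sample values are assumptions made for illustration only.

```python
import json

# Made-up sample; the real export's field names and values may differ.
SAMPLE_EXPORT = """
{
  "ticker": "EXMP",
  "data_points": [
    {"id": 1, "label": "Country of incorporation", "value": "example", "confidence": "High"},
    {"id": 6, "label": "Countries of operations", "value": "example", "confidence": "Medium"},
    {"id": 11, "label": "GHG / energy intensity", "value": "example", "confidence": "Low"}
  ]
}
"""


def points_to_verify(profile: dict) -> list[str]:
    """Labels of data points rated below High, in their original order."""
    return [p["label"] for p in profile["data_points"] if p["confidence"] != "High"]


if __name__ == "__main__":
    profile = json.loads(SAMPLE_EXPORT)
    for label in points_to_verify(profile):
        print(label)
```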
Accuracy & limitations
- This is an informational research tool, not investment advice.
- AI extraction can be incomplete or wrong — always verify important figures against the linked filing.
- Coverage is limited to companies that file with the SEC (U.S. companies and foreign issuers). Private companies and non-SEC entities aren't covered.
- Answers reflect the most recent annual report; they're refreshed automatically as new filings appear.
The tools below are part of the token-protected admin console. They aren't accessible to general users and require an admin access token — everything above is all a standard user needs.
Admin console
The console lives at /admin and is gated by a bearer token. Enter the token once and it's stored in your browser for subsequent visits ("Log out" clears it). The top of the console shows live stats: total companies, stored extractions, total requests, requests in the last 24 hours, and when the data was last refreshed.
Companies table
The Companies tab lists every company in the database — one row each — populated from the most recent validated extraction. It's dense by design, surfacing both structured and AI-derived data points side by side: incorporation, revenue, employees, whether it makes physical products, sustainability claims, emissions intensity, a count of flagged supply-chain risks, an overall confidence indicator, the record's status (active or degraded), and when it was last updated.
- Click any row to open the full stored profile — the exact same view as the public Generate page, populated from stored data with no re-extraction. From there you can Regenerate (force a fresh run), or Download / Copy its JSON.
- Search by ticker or name, and Refresh an individual company on demand.
Bulk CSV export
Tick rows (or use the header checkbox to select all) and click Export CSV for a wide, analysis-ready file: one row per company with 45 columns covering identity, all 12 data points (including numeric revenue and employee counts), a confidence level for each data point, and provenance (status, generated-at, source filing, model, and engine version). The file is quoted per RFC 4180 and UTF-8 encoded for clean import into Excel or any data tool.
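
For readers who want to reproduce a similar file from exported JSON, here is a minimal sketch of RFC 4180 style output using Python's standard `csv` module. The four column names are a hypothetical subset, not the real 45-column layout.

```python
import csv
import io

# Hypothetical subset of columns; the real export has 45.
COLUMNS = ["ticker", "company_name", "revenue_usd", "revenue_confidence"]


def profiles_to_csv(rows: list[dict]) -> bytes:
    """Serialize rows as UTF-8 CSV with CRLF line endings.

    The csv module quotes any field containing a comma, quote, or newline
    and doubles embedded quotes, which matches RFC 4180.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\r\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


if __name__ == "__main__":
    sample = [{
        "ticker": "EXMP",
        "company_name": 'Example "Holdings", Inc.',
        "revenue_usd": 1000,
        "revenue_confidence": "High",
    }]
    print(profiles_to_csv(sample).decode("utf-8"))
```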
The database
Every request and result is logged to a private, API-accessible Cloudflare D1 (SQLite) database with four tables:
| Table | Holds |
|---|---|
| companies | One row per tracked company: identity, status, and latest filing reference. |
| extractions | Every generated profile, stored as full JSON and timestamped. |
| request_log | Every API request: ticker, endpoint, status, cache hit, duration, and country. |
| validation_log | Every refresh/validation outcome and which fields changed. |
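
To make the four-table layout concrete, the sketch below creates a comparable schema in an in-memory SQLite database. Only the table names come from the description above; every column name and type is a guess based on the "Holds" column and should be treated as hypothetical.

```python
import sqlite3

# Table names are from the documentation; all columns are assumptions.
SCHEMA = """
CREATE TABLE companies (
    cik INTEGER PRIMARY KEY,
    ticker TEXT NOT NULL,
    name TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'active',  -- 'active' or 'degraded'
    latest_filing TEXT
);
CREATE TABLE extractions (
    id INTEGER PRIMARY KEY,
    cik INTEGER NOT NULL REFERENCES companies(cik),
    profile_json TEXT NOT NULL,
    generated_at TEXT NOT NULL
);
CREATE TABLE request_log (
    id INTEGER PRIMARY KEY,
    ticker TEXT,
    endpoint TEXT,
    status INTEGER,
    cache_hit INTEGER,
    duration_ms INTEGER,
    country TEXT
);
CREATE TABLE validation_log (
    id INTEGER PRIMARY KEY,
    cik INTEGER NOT NULL REFERENCES companies(cik),
    outcome TEXT NOT NULL,
    changed_fields TEXT
);
"""


def create_database() -> sqlite3.Connection:
    """Create the sketch schema in a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List user tables alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(table_names(create_database()))
```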
Scheduled validation & refresh
A scheduled worker (edgar-refresh) runs daily at 06:17 UTC to keep stored data current:
- It re-checks companies whose data is stale (older than 7 days) or was generated in a degraded state, prioritizing the degraded ones.
- If the latest filing is unchanged, the record is marked validated-current with no re-run; a new filing triggers a full re-extraction, and any fields that changed are logged.
- Self-healing: if a profile was ever generated while the filing text or the AI service was unavailable, it's flagged, kept out of the serve-cache, and automatically re-run on the next cycle.
- You can also trigger an on-demand batch (Refresh tab) or refresh a single company (row button).
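
The selection rule in the first bullet (stale after 7 days, or degraded, with degraded records first) might look like the sketch below. The `StoredCompany` type and the oldest-first tie-break are assumptions; only the 7-day threshold and the degraded-first priority come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=7)


@dataclass
class StoredCompany:
    """Hypothetical view of a row in the companies table."""
    ticker: str
    status: str           # "active" or "degraded"
    updated_at: datetime


def select_for_refresh(companies: list[StoredCompany], now: datetime) -> list[StoredCompany]:
    """Pick degraded or stale companies, degraded first, then oldest first."""
    due = [
        c for c in companies
        if c.status == "degraded" or now - c.updated_at > STALE_AFTER
    ]
    # False sorts before True, so degraded records lead the batch.
    return sorted(due, key=lambda c: (c.status != "degraded", c.updated_at))


if __name__ == "__main__":
    now = datetime(2025, 1, 10, 6, 17, tzinfo=timezone.utc)
    stored = [
        StoredCompany("AAA", "active", datetime(2025, 1, 9, tzinfo=timezone.utc)),
        StoredCompany("BBB", "active", datetime(2025, 1, 1, tzinfo=timezone.utc)),
        StoredCompany("CCC", "degraded", datetime(2025, 1, 9, tzinfo=timezone.utc)),
    ]
    print([c.ticker for c in select_for_refresh(stored, now)])
```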
Admin API
Everything in the console is backed by a JSON API. The public endpoints are GET /api/search and POST /api/generate. Admin endpoints live under /api/admin/* and require an Authorization: Bearer <token> header:
- stats, profiles, companies, requests, extractions (list, or one with full JSON by id), validations
- POST refresh: force a single company ({"ticker":"AAPL"}) or run a stale-batch ({})
- health: a dependency probe of the database, cache, AI models (primary + fallback), and SEC reachability
    # Example: pull a company's full stored profile
    curl -H "Authorization: Bearer $ADMIN_TOKEN" \
      "https://edgar.lukewade.net/api/admin/profiles?q=apple"
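
The same calls can be prepared from a script. The sketch below only builds the requests and does not send them. The helper name is hypothetical, and sending the refresh body as JSON with a Content-Type header is an assumption; the base URL, paths, bearer header, and payloads follow the endpoint list and example above.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://edgar.lukewade.net"


def admin_request(path: str, token: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated admin API request.

    A payload, even an empty dict, makes this a POST with a JSON body;
    without one it is a plain GET.
    """
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    method = "GET" if payload is None else "POST"
    request = urllib.request.Request(
        f"{BASE_URL}/api/admin/{path}", data=data, method=method
    )
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        # Assumed: the API expects a JSON content type for POST bodies.
        request.add_header("Content-Type", "application/json")
    return request


if __name__ == "__main__":
    lookup = admin_request("profiles?q=apple", "example-token")
    refresh = admin_request("refresh", "example-token", {"ticker": "AAPL"})
    print(lookup.get_method(), lookup.full_url)
    print(refresh.get_method(), refresh.full_url, refresh.data)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; that step is left out so the sketch runs without a token or network access.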