Tutorial · March 24, 2026 · 7 min read

Extract Structured Data from Unstructured Text with One API Call

Alex Chen

Developer Advocate

The Unstructured Data Problem

An often-cited industry estimate puts unstructured data at about 80% of what businesses hold: emails, PDFs, chat logs, support tickets. Extracting structured information from this text has traditionally required complex regex patterns, NLP pipelines, or manual data entry. A data extraction API can handle all of this in a single call.

Quick Start

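const API_URL = "https://instantapis.net/api/v1/generate";
// Read the key from the environment rather than hard-coding it.
// INSTANTAPI_KEY is a placeholder variable name; use whatever your setup defines.
const API_KEY = process.env.INSTANTAPI_KEY;

async function extractData(text, fields) {
  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      task: "extract",
      input: text,
      options: { fields },
    }),
  });
  // fetch does not reject on HTTP error statuses, so check explicitly.
  if (!response.ok) {
    throw new Error(`Extraction request failed with status ${response.status}`);
  }
  return response.json();
}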
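
For Python callers, the same request can be sketched with only the standard library. This is a minimal sketch, not an official client: the function names build_extract_request and extract_data are made up for illustration, and the payload simply mirrors the task/input/options shape shown above. Building the request is kept separate from sending it so the payload can be inspected without a network call.

```python
import json
import urllib.request

API_URL = "https://instantapis.net/api/v1/generate"  # endpoint from the Quick Start


def build_extract_request(text, fields, api_key):
    """Build (but do not send) the POST request shown in the Quick Start."""
    if not fields:
        raise ValueError("fields must name at least one value to extract")
    payload = {"task": "extract", "input": text, "options": {"fields": list(fields)}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_data(text, fields, api_key, timeout=30):
    """Send the request and return the decoded JSON body."""
    request = build_extract_request(text, fields, api_key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Unlike fetch, urlopen raises on HTTP error statuses, so a failed call surfaces as an exception instead of a JSON body that has to be checked by hand.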

Example 1: Extract Contact Information

const email = `
Hi, I'm John Smith from Acme Corporation. You can reach me at
john.smith@acme.com or call (555) 123-4567. Our office is at
123 Main Street, San Francisco, CA 94105.
`;

const result = await extractData(email, [
  "name", "company", "email", "phone", "address"
]);

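// result.data holds the extracted fields (assuming the { success, data }
// response shape that the Python examples in this post read):
// {
//   name: "John Smith",
//   company: "Acme Corporation",
//   email: "john.smith@acme.com",
//   phone: "(555) 123-4567",
//   address: "123 Main Street, San Francisco, CA 94105"
// }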
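
Before writing a result into a CRM, it can help to confirm that every requested field came back with a value. The helper below is a hypothetical sketch (find_missing_fields is not part of the API), and it assumes a missing value shows up as an absent key, null, or an empty string; the article does not say how the service reports values it cannot find.

```python
def find_missing_fields(extracted, expected_fields):
    """Return the requested fields that are absent or empty in the result."""
    missing = []
    for field in expected_fields:
        value = extracted.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


# Hypothetical result in which no address was found.
extracted = {
    "name": "John Smith",
    "company": "Acme Corporation",
    "email": "john.smith@acme.com",
    "phone": "(555) 123-4567",
    "address": "",
}
expected = ["name", "company", "email", "phone", "address"]
print(find_missing_fields(extracted, expected))  # ['address']
```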

Example 2: Parse Invoice Data

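import json
import os

import requests

API_URL = "https://instantapis.net/api/v1/generate"
# INSTANTAPI_KEY is a placeholder environment variable name.
API_KEY = os.environ["INSTANTAPI_KEY"]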

invoice_text = """
Invoice #INV-2026-0847
Date: March 15, 2026
Due: April 14, 2026

Bill to: TechCorp Inc.
         456 Oak Avenue, Austin, TX 78701

Items:
- Web Development Services: $5,000.00
- API Integration: $2,500.00
- Monthly Hosting: $200.00

Subtotal: $7,700.00
Tax (8.25%): $635.25
Total: $8,335.25
"""

response = requests.post(API_URL, json={
    "task": "extract",
    "input": invoice_text,
    "options": {
        "fields": [
            "invoice_number", "date", "due_date",
            "company", "items", "subtotal", "tax", "total"
        ]
    },
}, headers={
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
})

data = response.json()
print(json.dumps(data["data"], indent=2))
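
Extracted invoice figures are worth a sanity check before they reach an accounting system. The sketch below verifies that line items sum to the subtotal and that subtotal plus tax equals the total. The shape of the extracted dictionary (amounts as strings like "$7,700.00", items as a list of description/amount pairs) is an assumption for illustration; the amounts themselves come from the invoice text above.

```python
from decimal import Decimal


def parse_money(value):
    """Convert a string such as "$7,700.00" to a Decimal."""
    return Decimal(str(value).replace("$", "").replace(",", "").strip())


def totals_are_consistent(invoice):
    """Check items sum to the subtotal and subtotal + tax equals the total."""
    items_sum = sum(
        (parse_money(item["amount"]) for item in invoice["items"]),
        Decimal("0"),
    )
    subtotal = parse_money(invoice["subtotal"])
    tax = parse_money(invoice["tax"])
    total = parse_money(invoice["total"])
    return items_sum == subtotal and subtotal + tax == total


# Assumed result shape; amounts are taken from the sample invoice.
invoice = {
    "items": [
        {"description": "Web Development Services", "amount": "$5,000.00"},
        {"description": "API Integration", "amount": "$2,500.00"},
        {"description": "Monthly Hosting", "amount": "$200.00"},
    ],
    "subtotal": "$7,700.00",
    "tax": "$635.25",
    "total": "$8,335.25",
}
print(totals_are_consistent(invoice))  # True
```

Decimal is used instead of float so that currency sums compare exactly.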

Example 3: Parse Resumes

const resume = `
Jane Doe
Senior Software Engineer | 8 years experience
jane.doe@email.com | San Francisco, CA

Experience:
- Lead Engineer at StartupXYZ (2023-present)
- Senior Developer at BigTech Inc (2020-2023)
- Developer at WebAgency (2018-2020)

Skills: Python, JavaScript, TypeScript, React, Node.js, PostgreSQL, AWS
Education: B.S. Computer Science, Stanford University (2018)
`;

// Named resumeResult so this snippet can share a file with Example 1.
const resumeResult = await extractData(resume, [
  "name", "title", "email", "location", "years_experience",
  "skills", "education", "work_history"
]);
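
Downstream code is usually easier to write against a typed record than a loose dictionary. The following sketch maps an extraction result onto a dataclass. Candidate and to_candidate are hypothetical names, and the sketch assumes skills may arrive either as a list or as a comma-separated string, since the article does not show the response for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    email: str
    title: Optional[str] = None
    location: Optional[str] = None
    skills: List[str] = field(default_factory=list)


def to_candidate(extracted):
    """Map an extraction result onto a Candidate, tolerating missing keys."""
    skills = extracted.get("skills") or []
    if isinstance(skills, str):
        # Accept a comma-separated string as well as a list.
        skills = [s.strip() for s in skills.split(",") if s.strip()]
    return Candidate(
        name=extracted.get("name", ""),
        email=extracted.get("email", ""),
        title=extracted.get("title"),
        location=extracted.get("location"),
        skills=list(skills),
    )


candidate = to_candidate({
    "name": "Jane Doe",
    "email": "jane.doe@email.com",
    "title": "Senior Software Engineer",
    "skills": "Python, JavaScript, TypeScript",
})
print(candidate.skills)  # ['Python', 'JavaScript', 'TypeScript']
```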

Processing Documents at Scale

from concurrent.futures import ThreadPoolExecutor

import requests

# API_URL is defined in Example 2; API_KEY holds your InstantAPI key.

def extract_from_document(doc_text):
    response = requests.post(API_URL, json={
        "task": "extract",
        "input": doc_text,
        "options": {"fields": ["name", "email", "company", "amount"]},
    }, headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }, timeout=30)  # avoid hanging a worker thread on a stalled request
    return response.json()

# Placeholder list standing in for, say, 100 document texts
documents = ["doc text 1...", "doc text 2..."]  # ...and so on

# Process the documents in parallel, at most 10 requests at a time
with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(extract_from_document, documents))

# Print the extracted data for every call that reports success
for r in results:
    if r.get("success"):
        print(r["data"])
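
In the parallel loop above, a single request that raises (a timeout, a dropped connection) propagates out of executor.map and aborts the whole batch. One way to contain that, sketched here under stated assumptions, is to wrap the extraction function with retries and exponential backoff and record failures per document. with_retries and process_all are hypothetical helpers, the "ok"/"error" envelope is local to this sketch rather than anything the API returns, and the retry counts and delays are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Wrap func so failures are retried with exponential backoff."""
    def wrapper(doc_text):
        last_error = None
        for attempt in range(attempts):
            try:
                return {"ok": True, "data": func(doc_text)}
            except Exception as exc:  # in real code, catch narrower errors
                last_error = exc
                if attempt < attempts - 1:
                    # Wait base_delay, then 2x, 4x, ... before the next try.
                    sleep(base_delay * 2 ** attempt)
        return {"ok": False, "error": str(last_error)}
    return wrapper


def process_all(documents, extract, max_workers=10, **retry_options):
    """Run extract over documents, at most max_workers calls at a time."""
    safe_extract = with_retries(extract, **retry_options)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves input order in its results.
        return list(executor.map(safe_extract, documents))
```

Calling process_all(documents, extract_from_document) then returns one entry per document in input order, so failed documents can be logged or re-queued without losing the successful ones.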

Why Use an API Instead of Regex?

Approach              Handles edge cases    Setup time     Maintenance
Regex patterns        Poor                  Hours-days     High
spaCy/NLP pipeline    Moderate              Days-weeks     Medium
InstantAPI            Excellent             Minutes        None

Regex breaks when formatting changes. NLP pipelines need training data. An AI-powered extraction API handles variations, typos, and unexpected formats automatically.
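
To make the regex point concrete, here is a small illustration. The pattern is written for the "(555) 123-4567" style used in Example 1; the other sample strings are made-up variations of the same number in formats that commonly appear in real text.

```python
import re

# A pattern written for the "(555) 123-4567" style used in Example 1.
PHONE_PATTERN = re.compile(r"\(\d{3}\) \d{3}-\d{4}")

samples = [
    "call (555) 123-4567",
    "call 555-123-4567",
    "call 555.123.4567",
    "call +1 555 123 4567",
]

for text in samples:
    match = PHONE_PATTERN.search(text)
    print(f"{text!r}: {match.group(0) if match else 'no match'}")
```

Only the first sample matches. Each new format needs another pattern or another alternation, which is where the maintenance cost in the table comes from.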

Use Cases

  • CRM data entry: Auto-populate contact records from emails
  • Invoice processing: Extract line items, totals, and dates
  • Resume parsing: Structure candidate information for applicant tracking systems (ATS)
  • Legal documents: Pull key terms, dates, and parties from contracts
  • Support tickets: Extract product names, error codes, and account IDs

Each call costs $0.50 — far less than manual data entry or building custom extraction pipelines.
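
For budgeting, the per-call price turns into a simple estimate. This sketch assumes one API call per document, which matches the examples in this post but may not hold if long documents have to be split; estimate_cost is a hypothetical helper.

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.50")  # price stated in this article


def estimate_cost(num_documents, calls_per_document=1):
    """Estimated spend in dollars for a batch of documents."""
    return PRICE_PER_CALL * num_documents * calls_per_document


print(estimate_cost(100))     # 50.00
print(estimate_cost(10_000))  # 5000.00
```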

Start extracting data now — get 10 free credits to try it out.
