Retrieving the Latest 10-K Report URL from SEC EDGAR

This guide outlines how to automatically retrieve the latest 10-K report link for a specific company from the SEC (U.S. Securities and Exchange Commission) EDGAR system. We will use Python’s requests and BeautifulSoup libraries to perform web scraping.

1. Overview of Searching for 10-K Reports on SEC EDGAR

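SEC EDGAR is a public database that provides the filings (10-K, 10-Q, etc.) of companies that report to the SEC, including publicly traded U.S. companies. To list a company’s 10-K filings, use the following URL, where the CIK parameter takes the company’s CIK number or, as in this guide, its ticker symbol: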

https://www.sec.gov/cgi-bin/browse-edgar?CIK=<ticker>&type=10-K&count=10&action=getcompany

For example, to search for Apple’s (AAPL) 10-K report, use:

https://www.sec.gov/cgi-bin/browse-edgar?CIK=AAPL&type=10-K&count=10&action=getcompany
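As a minimal sketch, the same URL can be assembled with the standard library instead of editing the string by hand. The helper name build_search_url and its form_type and count parameters are illustrative assumptions, not part of EDGAR or of the script in section 2.

```python
from urllib.parse import urlencode

EDGAR_BROWSE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"

def build_search_url(ticker, form_type="10-K", count=10):
    """Build the EDGAR company search URL (hypothetical helper)."""
    # Parameter names and order mirror the URL shown above
    params = {
        "CIK": ticker,
        "type": form_type,
        "count": count,
        "action": "getcompany",
    }
    return f"{EDGAR_BROWSE_URL}?{urlencode(params)}"

print(build_search_url("AAPL"))
```

Passing a different form_type (for example "10-Q") changes only the type parameter, which is one way the same pattern could be reused for other filing types.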

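This page lists the matching filings with the most recent first. Each row has a “Documents” link that leads to the detail page for that filing, so the first “Documents” link on the page belongs to the latest one.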

2. Automatically Extracting the 10-K Report Link with Python

2.1 Install Required Libraries

First, install the necessary libraries:

pip install requests beautifulsoup4

2.2 Python Code to Retrieve the 10-K Report URL

The following code automatically finds and returns the latest 10-K report link for a specified company from SEC EDGAR.

import requests
from bs4 import BeautifulSoup

def get_latest_10k_url(ticker):
    # SEC EDGAR search page URL
    search_url = f"https://www.sec.gov/cgi-bin/browse-edgar?CIK={ticker}&type=10-K&count=10&action=getcompany"
    
    # SEC asks automated clients to declare a User-Agent that identifies the requester.
    # The value below is a placeholder: replace it with your own name and contact email.
    headers = {
        "User-Agent": "Your Name your.email@example.com"
    }

    session = requests.Session()
    session.headers.update(headers)

    # Request the SEC EDGAR search page
    response = session.get(search_url, timeout=30)
    response.raise_for_status()  # fail early instead of parsing an error page
    soup = BeautifulSoup(response.text, "html.parser")

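    # Find the first "Documents" button (the most recent filing);
    # match on a substring so surrounding whitespace in the link text does not break the lookup
    doc_button = soup.find("a", string=lambda s: s is not None and "Documents" in s)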
    if not doc_button:
        print(f"10-K report for {ticker} not found. (Documents button missing)")
        return None

    # Navigate to the "Documents" page
    docs_url = "https://www.sec.gov" + doc_button["href"]
    response = session.get(docs_url, timeout=30)
    response.raise_for_status()  # fail early instead of parsing an error page
    soup = BeautifulSoup(response.text, "html.parser")

    # Locate the "Document Format Files" table
    table = soup.find("table", {"summary": "Document Format Files"})
    if not table:
        print(f"Document table for {ticker} not found.")
        return None

    latest_10k_url = None
    for row in table.find_all("tr"):
        cols = row.find_all("td")
        
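        # Find the "10-K" document: cols[1] is the description, cols[2] holds the link
        if len(cols) > 2 and "10-K" in cols[1].text:
            link = cols[2].find("a")
            if link is None:
                continue  # row has no link; keep looking
            latest_10k_url = "https://www.sec.gov" + link["href"]
            break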

    if latest_10k_url:
        print(f"✅ Latest 10-K Report URL: {latest_10k_url}")
        return latest_10k_url
    else:
        print(f"⚠️ Latest 10-K report for {ticker} not found.")
        return None

# Retrieve the latest 10-K report URL for Apple (AAPL)
ticker = "AAPL"
latest_10k_url = get_latest_10k_url(ticker)

3. Example Output

Running the code above prints output similar to:

✅ Latest 10-K Report URL: https://www.sec.gov/ix?doc=/Archives/edgar/data/0000320193/000032019324000123/aapl-20240928.htm

This allows automatic retrieval of the latest 10-K report link for Apple (AAPL).
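The /ix?doc= prefix in this URL opens the filing in EDGAR’s inline XBRL viewer, and the value of the doc parameter is the path of the underlying HTML document. If you want to download the document itself rather than the viewer page, the prefix can be removed. The following is a minimal sketch using only the standard library; the helper name strip_inline_viewer is a hypothetical addition and not part of the script above.

```python
from urllib.parse import urlparse, parse_qs

def strip_inline_viewer(url):
    """Return the direct document URL if `url` points at the /ix viewer (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.path != "/ix":
        return url  # already a direct link
    doc_paths = parse_qs(parsed.query).get("doc")
    if not doc_paths:
        return url  # no doc= parameter to unwrap
    return f"{parsed.scheme}://{parsed.netloc}{doc_paths[0]}"

viewer_url = "https://www.sec.gov/ix?doc=/Archives/edgar/data/0000320193/000032019324000123/aapl-20240928.htm"
print(strip_inline_viewer(viewer_url))
```

URLs that do not use the viewer are returned unchanged, so the helper can be applied to every result.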

4. Potential Enhancements

This script provides basic functionality but can be extended further:

  1. Batch processing for multiple companies
  2. Automated retrieval of 10-Q (quarterly reports) and other filings
  3. NLP-based extraction of key insights from reports
  4. Selenium integration for dynamic crawling
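As one possible starting point for the first item, the sketch below loops over several tickers and pauses between requests. SEC publishes fair-access guidance that limits request rates, so some delay is advisable; the half-second default here is an arbitrary conservative choice. The name get_urls_for_tickers and its parameters are hypothetical, and the fetch function is passed in so that the loop can be tried with a stand-in before pointing it at EDGAR.

```python
import time

def get_urls_for_tickers(tickers, fetch_url, delay_seconds=0.5):
    """Call `fetch_url` for each ticker, pausing between calls (hypothetical helper)."""
    results = {}
    for index, ticker in enumerate(tickers):
        if index > 0:
            time.sleep(delay_seconds)  # space out requests to the server
        try:
            results[ticker] = fetch_url(ticker)
        except Exception as exc:  # one failing ticker should not abort the batch
            print(f"Skipping {ticker}: {exc}")
            results[ticker] = None
    return results

# Stand-in fetcher so the sketch runs without network access
def fake_fetch(ticker):
    return f"https://example.com/{ticker}"

print(get_urls_for_tickers(["AAPL", "MSFT"], fake_fetch, delay_seconds=0))
```

To use it with the script from section 2.2, pass get_latest_10k_url in place of fake_fetch and keep a non-zero delay.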

5. Conclusion

With this method, you can automatically retrieve the latest 10-K reports from SEC EDGAR using Python. This technique is useful for investment research, corporate analysis, and data collection automation.

If you have any questions or suggestions for improvement, feel free to reach out! 😊
