Retrieving the Latest 10-K Report URL from SEC EDGAR
This guide outlines how to automatically retrieve the latest 10-K report link for a specific company from the SEC (U.S. Securities and Exchange Commission) EDGAR system. We will use Python’s requests and BeautifulSoup libraries to perform web scraping.
1. Overview of Searching for 10-K Reports on SEC EDGAR
SEC EDGAR is a public database that provides financial reports (10-K, 10-Q, etc.) for publicly traded U.S. companies. To search for a company’s 10-K report, use the following URL:
https://www.sec.gov/cgi-bin/browse-edgar?CIK=<ticker>&type=10-K&count=10&action=getcompany
For example, to search for Apple’s (AAPL) 10-K report, use:
https://www.sec.gov/cgi-bin/browse-edgar?CIK=AAPL&type=10-K&count=10&action=getcompany
The results page lists the matching filings with the most recent first. Each row has a “Documents” link that leads to the detailed document page for that filing.
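The URL pattern above can also be assembled programmatically. The following is a minimal sketch using only the standard library; the function name build_search_url, the constant EDGAR_SEARCH_ENDPOINT, and the form_type and count parameters are illustrative names, not part of any SEC API.

```python
from urllib.parse import urlencode

# Endpoint taken from the search URL shown above
EDGAR_SEARCH_ENDPOINT = "https://www.sec.gov/cgi-bin/browse-edgar"


def build_search_url(ticker: str, form_type: str = "10-K", count: int = 10) -> str:
    """Build the EDGAR company search URL for a ticker and filing type."""
    params = {
        "CIK": ticker,
        "type": form_type,
        "count": count,
        "action": "getcompany",
    }
    # urlencode escapes any characters that are not URL-safe
    return f"{EDGAR_SEARCH_ENDPOINT}?{urlencode(params)}"


print(build_search_url("AAPL"))
print(build_search_url("AAPL", form_type="10-Q"))
```

Keeping the filing type as a parameter means the same helper can produce search URLs for other forms, such as 10-Q.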
2. Automatically Extracting the 10-K Report Link with Python
2.1 Install Required Libraries
First, install the necessary libraries:
pip install requests beautifulsoup4
2.2 Python Code to Retrieve the 10-K Report URL
The following code automatically finds and returns the latest 10-K report link for a specified company from SEC EDGAR.
import requests
from bs4 import BeautifulSoup


def get_latest_10k_url(ticker):
    # SEC EDGAR search page URL
    search_url = f"https://www.sec.gov/cgi-bin/browse-edgar?CIK={ticker}&type=10-K&count=10&action=getcompany"
    headers = {
        # SEC asks automated clients to identify themselves in the User-Agent.
        # This value is a placeholder: replace it with your own name and email.
        "User-Agent": "Your Name your.email@example.com",
    }
    session = requests.Session()
    session.headers.update(headers)

    # Request the SEC EDGAR search page
    response = session.get(search_url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    # Find the first "Documents" button (filings are listed newest first).
    # A substring match tolerates extra whitespace around the link text.
    doc_button = soup.find("a", string=lambda s: s and "Documents" in s)
    if not doc_button:
        print(f"10-K report for {ticker} not found. (Documents button missing)")
        return None

    # Navigate to the "Documents" page
    docs_url = "https://www.sec.gov" + doc_button["href"]
    response = session.get(docs_url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    # Locate the "Document Format Files" table
    table = soup.find("table", {"summary": "Document Format Files"})
    if not table:
        print(f"Document table for {ticker} not found.")
        return None

    latest_10k_url = None
    for row in table.find_all("tr"):
        cols = row.find_all("td")
        # Find the "10-K" document: column 1 is the description, column 2 holds the link
        if len(cols) > 2 and "10-K" in cols[1].text:
            link = cols[2].find("a")
            if link and link.get("href"):
                latest_10k_url = "https://www.sec.gov" + link["href"]
                break

    if latest_10k_url:
        print(f"✅ Latest 10-K Report URL: {latest_10k_url}")
        return latest_10k_url
    else:
        print(f"⚠️ Latest 10-K report for {ticker} not found.")
        return None


# Retrieve the latest 10-K report URL for Apple (AAPL)
ticker = "AAPL"
latest_10k_url = get_latest_10k_url(ticker)
3. Example Output
Running the above code will produce an output similar to:
✅ Latest 10-K Report URL: https://www.sec.gov/ix?doc=/Archives/edgar/data/0000320193/000032019324000123/aapl-20240928.htm
This allows automatic retrieval of the latest 10-K report link for Apple (AAPL).
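Note that the example URL above starts with /ix?doc=, which opens the filing in EDGAR's inline XBRL viewer rather than pointing at the HTML file itself. If you want to download and parse the document, a small helper can convert the viewer URL to the underlying file URL. This is a sketch; the function name to_raw_document_url is hypothetical, and it assumes the viewer URL always carries the file path in a doc query parameter, as in the example.

```python
from urllib.parse import parse_qs, urlparse


def to_raw_document_url(url: str) -> str:
    """Convert an inline-viewer URL (/ix?doc=...) to the direct document URL.

    URLs that do not use the viewer are returned unchanged.
    """
    parsed = urlparse(url)
    doc_paths = parse_qs(parsed.query).get("doc")
    if parsed.path == "/ix" and doc_paths:
        # The "doc" parameter holds the site-relative path of the real file
        return f"{parsed.scheme}://{parsed.netloc}{doc_paths[0]}"
    return url


viewer_url = (
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0000320193/"
    "000032019324000123/aapl-20240928.htm"
)
print(to_raw_document_url(viewer_url))
```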
4. Potential Enhancements
This script provides basic functionality but can be extended further:
- Batch processing for multiple companies
- Automated retrieval of 10-Q (quarterly reports) and other filings
- NLP-based extraction of key insights from reports
- Selenium integration for dynamic crawling
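As an illustration of the first item, batch processing could be sketched as follows. The name fetch_many and its parameters are hypothetical. The fetch argument stands in for any function that takes a ticker and returns a URL or None, such as get_latest_10k_url from section 2.2. The default delay is a conservative assumption based on SEC's fair-access guidance, which asks clients to stay under 10 requests per second.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def fetch_many(
    tickers: Iterable[str],
    fetch: Callable[[str], Optional[str]],
    delay_seconds: float = 0.5,
) -> Dict[str, Optional[str]]:
    """Call fetch(ticker) for each ticker, pausing between calls.

    A failure for one ticker is recorded as None so that the
    remaining tickers are still processed.
    """
    results: Dict[str, Optional[str]] = {}
    for index, ticker in enumerate(tickers):
        if index > 0:
            # Pause between companies to avoid hammering the server
            time.sleep(delay_seconds)
        try:
            results[ticker] = fetch(ticker)
        except Exception as exc:
            print(f"Skipping {ticker}: {exc}")
            results[ticker] = None
    return results
```

With the function from section 2.2 in scope, a call such as fetch_many(["AAPL", "MSFT"], get_latest_10k_url) would return a dictionary mapping each ticker to its report URL, or to None where the lookup failed.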
5. Conclusion
With this method, you can automatically retrieve the latest 10-K reports from SEC EDGAR using Python. This technique is useful for investment research, corporate analysis, and data collection automation.
If you have any questions or suggestions for improvement, feel free to reach out! 😊