Skip to content

Latest commit

 

History

History

README.md

Privacy Detection Extended Database

IP Address to privacy detection database with detection method

IPinfo's IP Address to Privacy Detection (Extended) database includes IP addresses that have been used by VPNs, proxies, relays, TOR and hosting services that can be used to hide a user's location. The database showcases how detect the privacy service associated with an IP address.

Database Schema & Description

[data updated as of January, 2025]

Field Name Example Data Type Descrption
network 45.129.35.234 TEXT CIDR/IP Range or single IP address block
hosting true BOOLEAN Indicates a hosting/cloud service/data center IP address
proxy false BOOLEAN Indicates a open web proxy IP address
relay false BOOLEAN Indicates location preserving anonymous relay service
tor false BOOLEAN Indicates a TOR (The Onion Router) exit node IP address
vpn true BOOLEAN Indicates Virtual Private Network (VPN) service exit node IP address
service NordVPN TEXT Name of the privacy service provider includes VPN, Proxy and Relay service providers names
first_seen 2024-10-31 DATE Date when the activity on an anonymous IP address was first observed: Date in YYYY-MM-DD format, ISO-8601. Within the 3-month lookback period.
last_seen 2025-01-03 DATE Date when the activity on an anonymous IP address was last/recently observed: Date in YYYY-MM-DD format, ISO-8601.
confidence 3 INTEGER The level (from 1 to 3) of confidence attributed to the best source associated with this range
coverage 1.0 FLOAT For inferred ranges (see inferred flag), represents the proportion of the range (in IP count) that we saw direct evidence of VPN activity on; the remaining percentage of the range (1 - coverage) is composed of IPs we did not directly observe. For IPs/ranges we've fully directly observed VPN evidence on, this value is 1.0.
census false BOOLEAN Ranges where we've observed VPN software/ports on; we run scans on ports and protocols commonly associated with VPN software. Ranges with the census flag are those where these scans obtained positive results
census_ports INTEGER The ports we've gotten positive results for when running our VPN detection census
device_activity false BOOLEAN Ranges on which we've observed device activity compatible with VPN usage (outside of known infrastructure area; simultaneous use around a large area; pingable and/or associated with hosting providers)
inferred false BOOLEAN Whether the range associated with the record is the result of direct observation or inference based on neighboring IPs
vpn_config true BOOLEAN Ranges where we confirmed VPN activity by directly running VPN software from almost 200 different providers and collecting exit IPs
whois false BOOLEAN Ranges where we've observed VPN software/ports on AND have a WHOIS association with either VPNs in general or specific VPN providers. e.g. if our ipsec scan returned a positive result for an IP and its WHOIS record indicates that it is owned by a VPN provider, this flag will be true.

Note that the network field can contain either CIDR-noted ranges or individual IP addresses.

Confidence intervals defined:

Confidence Level Description
3 Direct observation of commercial use (vpn_config)
2 Direct observation of VPN software running on the range (census) + registrar information associated with VPNs or specific providers OR highly convincing device activity (large spread of devices on pingable networks that are not associated with carrier traffic and known to be associated with hosting providers)
1 Direct observation of VPN software running on the range (census) without known association to specific providers or VPNs in general OR device data that is suspicious but not associated with hosting ranges

Samples

Downloadable File Formats

  • CSV: Plain text file format where data is organized into rows, with individual values separated by commas.
  • JSON: More specifically, NDJSON (Newline Delimited JSON), a text file format where each line is a separated in valid JSON object.
  • MMDB: Specialized binary database for efficient and fast IP lookups.
  • Parquet: A columnar storage file format optimized for efficient data querying.

Please refer to "How to choose the best file format for your IPinfo database?" article to select the best format possible for your use case.

The usage of the IP data downloads relies on the software or application of the data. Check out our documentation, community, and our integrations pages to find the best path forward.

Filename references:

File Format Filename / Slug Terminal Command
CSV ipinfo_privacy_extended.csv.gz curl -L https://ipinfo.io/data/ipinfo_privacy_extended.csv.gz?token=$YOUR_TOKEN -o ipinfo_privacy_extended.csv.gz
MMDB ipinfo_privacy_extended.mmdb curl -L https://ipinfo.io/data/ipinfo_privacy_extended.mmdb?token=$YOUR_TOKEN -o ipinfo_privacy_extended.mmdb
JSON ipinfo_privacy_extended.json.gz curl -L https://ipinfo.io/data/ipinfo_privacy_extended.json.gz?token=$YOUR_TOKEN -o ipinfo_privacy_extended.json.gz
Parquet ipinfo_privacy_extended.parquet curl -L https://ipinfo.io/data/ipinfo_privacy_extended.parquet?token=$YOUR_TOKEN -o ipinfo_privacy_extended.parquet

Links

🔗 IP to Privacy Detection Extended Data Downloads Documentation

The IP to Privacy Detectio Extended Dataset is a customized product available only through request. Reach out to our sales team to get started.

Articles & Guides

FAQ (Frequently Asked Questions)

If you have any questions about using our data, feel free to ask us about it in the IPinfo Community.


Interested in more?

Currently, we are limiting the sample datasets to only 100 rows. If you would like to request a larger sample or would like to get a quote on the database products, feel free to reach to us.

Follow us on Twitter and LinkedIn to learn more about IP Address data and it’s fascinating potential.

About IPinfo

Founded in 2013, IPinfo prides itself on being the most reliable, accurate, and in-depth source of IP address data available anywhere. We process terabytes of data to produce our custom IP geolocation, company, carrier, VPN detection, hosted domains, and IP type data sets. Our API handles over 40 billion requests a month for 100,000 businesses and developers.

image