Open Big Data

a directory of open access datasets for social science research

The Economist Historical Advertisements – Industry Subset “Banking”

This dataset contains metadata for 92,592 historical advertisements from the banking industry, drawn from all 8,840 issues of The Economist magazine published from 1843 to 2014. It is part of a series of datasets related to The Economist Historical Archive.

Keywords: Advertising, Marketing, Historic, The Economist, Banking, Finance


  1. Filename: Unique identifier of this advertisement
  2. URLs TheEconomistPageScans: Comma-separated list of URLs to JPG image files of the scanned The Economist pages containing this ad. Multi-page ads can have multiple URLs.
  3. Date of Issue: Date of The Economist issue (Year-Month-Day)
  4. Ad size (pages): e.g. 1 = one full page, 0.75 = 3/4 of a page, 2 = two pages
  5. Ad size < 1/4: 1 if the ad covers less than 25% of the page; 0 otherwise
  6. 1/4 <= Ad size < 2/4: 1 if the ad covers at least 25% but less than 50% of the page; 0 otherwise
  7. 2/4 <= Ad size < 3/4
  8. 3/4 <= Ad size < 4/4
  9. 4/4 <= Ad size < 8/4
  10. 8/4 <= Ad size
  11. Bounding Box relative X1: x coordinate of the top-left corner of a rectangle identifying the ad on the page, relative to the pixel dimensions of the image from column 2 (“URLs …”). Multiply this value by the width of the image to get the absolute x coordinate. If the ad is a multi-page ad, the images from column 2 have to be horizontally concatenated first.
  12. Bounding Box relative Y1: y coordinate of the top-left corner
  13. Bounding Box relative X2: x coordinate of the bottom-right corner
  14. Bounding Box relative Y2: y coordinate of the bottom-right corner
  15. Feature Complexity (JPG file size in kb / Ad Size): More complex images will have higher values.
  16. JPG File Size (Byte): e.g. 186609
  17. OCR GoogleVision: Advertisement text, based on text recognition using Google Vision API (2021) of the full ad image.
  18. Brand: Brand name of advertiser
  19. Brand is generic (e.g. ‘Notices’): If “True” then this ad doesn’t represent a single brand, but a category of ad-like content. Most common categories are “Notices”, “Appointments”, “Courses”.
  20. Text Class GoogleVision: Classification of the ad from its OCR text using the GoogleVision API (2021); see the full list of categories. This column contains a JSON string with a list of text classes and their class probabilities.
  21. Category most confident, Level 1: Top-level category from the Google Vision text analysis for this ad, e.g. “/Finance”
  22. Category most confident, Level 2: e.g. “/Finance/Banking”
  23. Category most confident, Level 3
  24. Colorfulness (Hasler & Suesstrunk, 2003): Colorfulness of the ad, computed as described in Hasler & Suesstrunk (2003).
  25. Color variety (Ke et al., 2006): Color variety of the ad, computed as described in Ke et al. (2006).
  26. Brightness_Mean: Mean of brightness values of all pixels in ad.
  27. Brightness_SD: Standard deviation of brightness values of all pixels.
  28. Red_Mean: Mean value of redness of all pixels.
  29. Red_SD
  30. Green_Mean
  31. Green_SD
  32. Blue_Mean
  33. Blue_SD
  34. Text readability Gunning Fog: Text readability measure according to Gunning Fog index.
  35. Text readability SMOG
  36. Text readability Flesch Reading Ease
  37. Text readability Dale Chall
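
The bounding box columns (11 to 14) can be turned into pixel coordinates along the lines of the sketch below. It is only an illustration under stated assumptions: the function name `to_pixel_box` and the example scan sizes are hypothetical, the y values are assumed to scale with the image height in the same way the x values scale with the width, and for multi-page ads the relative x values are assumed to refer to the horizontally concatenated image.

```python
def to_pixel_box(rel_box, page_widths, page_height):
    """Convert a relative bounding box to absolute pixel coordinates.

    rel_box: (x1, y1, x2, y2) taken from columns 11 to 14.
    page_widths: pixel widths of the page scans from column 2, in order.
    page_height: pixel height of the scans (assumed equal for all pages).
    """
    x1, y1, x2, y2 = rel_box
    # Multi-page ads: the scans are placed side by side, so widths add up.
    total_width = sum(page_widths)
    return (
        round(x1 * total_width),
        round(y1 * page_height),  # assumption: y scales with image height
        round(x2 * total_width),
        round(y2 * page_height),
    )


# Single-page ad on a hypothetical 1000 x 1400 pixel scan.
single = to_pixel_box((0.1, 0.5, 0.9, 1.0), [1000], 1400)

# Two-page ad: both scans are treated as one 2000-pixel-wide image.
double = to_pixel_box((0.25, 0.0, 0.75, 1.0), [1000, 1000], 1400)
```

The resulting tuple can be passed to an image library's crop function once the page scans have been downloaded and, for multi-page ads, concatenated.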


Kluge, S., Gehrmann, L., Stahl, F., Knäble, M., Nadj, M., Maedche, A.

Funding / Grants



Creative Commons Attribution 4.0 International


Filename | URLs_TheEconomistPageScans | DateOfIssue | Ad size (pages) | Ad size < 1/4 | 1/4 <= Ad size < 2/4 | 2/4 <= Ad size < 3/4 | 3/4 <= Ad size < 4/4 | 4/4 <= Ad size < 8/4 | 8/4 <= Ad size | Bounding Box relative X1 | Bounding Box relative Y1 | Bounding Box relative X2 | Bounding Box relative Y2 | Feature Complexity (JPG file size in kb / Ad Size) | JPG File Size (Byte) | OCR_GoogleVision_original | Brand | Categories (from Google Natural Language API classify_text based on OCR text) | Category most confident, Level 1 | Category most confident, Level 2 | Category most confident, Level 3 | Colorfulness (Hasler & Suesstrunk, 2003) | Color variety (Ke et al., 2006) | Brightness_Mean | Brightness_SD | Red_Mean | Red_SD | Green_Mean | Green_SD | Blue_Mean | Blue_SD | Text readability Gunning Fog | Text readability SMOG | Text readability Flesch Reading Ease | Text readability Dale Chall
2013-0302-0065_000_99.jpg and Wherever
You Have Business Abroad …
… Hanover’s complete international
banking services are immediately
available to you. Offices in New York,
London and Paris-and correspondents
throughout the world-assure you of
prompt, current information on foreign
(Incorporated with Limited Liability in U.S.A.)
LONDON…7 Princes Street, E. C. 2…10 Mount Street, W. 1
NEW YORK … 70 Broadway ππ
The Hanover Bank{“/Finance/Banking”: 0.699999988079071}/Finance/Finance/Banking0.00.038.240526111434780.335418918499938.240526111434780.335418918240438.240526111434780.335418918240438.240526111434780.33541891824049.9810.944.29.32
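
As an illustration of how the three "Category most confident" columns relate to the JSON string in the categories column, the sketch below picks the class with the highest probability and splits its path into levels. The function name `most_confident_levels` is hypothetical, and the replacement of typographic quotes is an assumption about how the string may appear when copied from this page; the dataset's authors may have derived these columns differently.

```python
import json


def most_confident_levels(categories_json):
    """Return the Level 1, 2 and 3 paths of the highest-probability class."""
    if not categories_json.strip():
        return ("", "", "")
    # Assumption: typographic quotes may appear in copied text; normalize
    # them so that json.loads accepts the string.
    cleaned = categories_json.replace("\u201c", '"').replace("\u201d", '"')
    classes = json.loads(cleaned)
    if not classes:
        return ("", "", "")
    # Key with the largest class probability, e.g. "/Finance/Banking".
    best = max(classes, key=classes.get)
    parts = best.strip("/").split("/")
    # Build cumulative paths; levels deeper than the path stay empty.
    return tuple(
        "/" + "/".join(parts[: i + 1]) if i < len(parts) else ""
        for i in range(3)
    )


levels = most_confident_levels('{"/Finance/Banking": 0.699999988079071}')
```

For the sample row, this yields "/Finance" for Level 1, "/Finance/Banking" for Level 2, and an empty Level 3.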

Related Datasets

The Economist Historical Advertisements – Master Dataset

The Economist Historical Advertisements – Faces Dataset

The Economist Historical Advertisements – Objects Dataset
