bitsavers.org


Bitsavers'
Software Archive
Computing Archive
Communications Archive
Components Archive
Magazine Archive
Test Equipment Archive


2021-07-27

bitsavers.org has a new(ish) home

A HUGE thank you to Jay West for hosting bitsavers for 20 years



As of Aug 2021, there are over 115,000 files, including over 6 million text pages, in the archive.
This is around 26 bookcases, each 4 feet wide with 8 shelves, at 600 pages/inch, double-sided.

600 pgs/in x 12 in/ft = 7200 pgs/ft
7200 pgs/ft x 4 x 8 x 26 = 5990400
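
The bookcase estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the constant names are made up for illustration, and the figures are the ones quoted on this page.

```python
# Figures quoted above; the names are illustrative only.
PAGES_PER_INCH = 600        # double-sided pages per inch of shelf
INCHES_PER_FOOT = 12
SHELF_WIDTH_FT = 4          # each bookcase is 4 feet wide
SHELVES_PER_BOOKCASE = 8
BOOKCASES = 26

pages_per_foot = PAGES_PER_INCH * INCHES_PER_FOOT
total_pages = pages_per_foot * SHELF_WIDTH_FT * SHELVES_PER_BOOKCASE * BOOKCASES

print(pages_per_foot, total_pages)  # prints: 7200 5990400
```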


Bitsavers Updates RSS

RSS feeds for bitsavers updates are available
bits
communications
components
magazines
pdf
test_equipment

Twitter

bitsavers' twitter feed

Active Mirrors

Web

bitsavers.computerhistory.org
bitsavers.informatik.uni-stuttgart.de
bitsavers.trailing-edge.com
www.bighole.nl (this appears to be updated on Mondays and isn't using rsync)
University of Kent
ftpmirror.your.org

FTP

bitsavers.informatik.uni-stuttgart.de
University of Kent
ftpmirror.your.org

RSYNC

ftpmirror.your.org

rsync is the preferred method for cloning and syncing with the archive.
This site has no JavaScript, databases or any of that Web 2.0 stuff.
You can clone the entire archive with:
rsync -av rsync://bitsavers.org:/bitsavers/ bitsavers/
As of Jul 2021, the entire archive is around 600 GB.

If you are syncing, be warned that file names, dates and locations in the hierarchy change (these aren't permalinks).
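
If you script your sync, something like the following Python sketch may help. It only builds the command shown above and hands it to rsync. The function names are hypothetical, and the `--dry-run` and `--delete` options are standard rsync options that this page does not itself recommend; whether you want `--delete` (which removes local files that no longer exist on the server, e.g. after a rename) is an assumption about your setup.

```python
import subprocess

# Source and destination copied verbatim from the rsync command above.
SOURCE = "rsync://bitsavers.org:/bitsavers/"
DEST = "bitsavers/"


def build_rsync_command(dry_run=True, delete=False):
    """Return the rsync argument list; hypothetical helper, not part of the site."""
    cmd = ["rsync", "-av"]
    if dry_run:
        # Report what would be transferred without changing anything locally.
        cmd.append("--dry-run")
    if delete:
        # Assumption: you want files that were moved or renamed upstream
        # removed from the local copy instead of piling up as stale duplicates.
        cmd.append("--delete")
    return cmd + [SOURCE, DEST]


def sync(dry_run=True, delete=False):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_rsync_command(dry_run, delete), check=True)
```

Calling `sync()` with the defaults performs a dry run, which is a cheap way to see how much has moved before committing to a 600 GB transfer.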

Archive Indexing

An index file is maintained at the top level of each category hierarchy.
IndexByDate.txt is updated each time an indexed document is added to the archive.
These files drive the RSS feeds.
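
Since IndexByDate.txt changes whenever a document is added, one way to spot additions is to compare a saved copy with a fresh one. The sketch below assumes nothing about the layout of the file beyond it being line-oriented text; the function name is hypothetical.

```python
def new_index_entries(old_text, new_text):
    """Return lines found in new_text but not in old_text, in their original order.

    Hypothetical helper: treats the index as plain lines and does not
    interpret their contents.
    """
    seen = {line.strip() for line in old_text.splitlines() if line.strip()}
    added = []
    for line in new_text.splitlines():
        entry = line.strip()
        if entry and entry not in seen:
            added.append(entry)
            seen.add(entry)  # report each new line only once
    return added
```

Because file names and locations are not permanent, a renamed or moved document will also show up as a "new" line with this approach.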

Snapshots/Mirrors

  • Jul 2004 snapshot of pdp-11.trailing-edge.com
  • Jan 2005 snapshot of simh.trailing-edge.com
  • Jun 2012 snapshot of simh.trailing-edge.com
  • scans from the University of Queensland

The PDF Document Format

Documents here are kept in a minimal subset of PDF format, just using it as a
container for lossless Group 4 fax compression (ITU-T recommendation T.6) images.
Contributions are normally post-processed by tools to put them in exactly this format.

Documents were scanned using a Ricoh IS520 400dpi 30ppm B&W duplex production scanner
from the late 90's through 2007.

Conversion to higher-performance Kodak DS 2500D scanning occurred in July 2007.
The 2500D is an OEM version of the Panasonic KV-S2055 scanner.

In 2008, the Kodak was replaced by a Panasonic KV-S3065W, which
is capable of duplex color 600dpi scanning, and has the capability to scan
sheets 100 inches long.

Post-processing is done using Lemkesoft's Graphic Converter.
TIFF to PDF conversion is done using Eric Smith's tumble.
A final OCR step is done with Acrobat Pro.
I've continued to use tumble since it is MUCH faster than Acrobat for TIFF to PDF conversion.

The preferred form for any contributed text scan is as a collection of lossless
Group 4 fax compression (ITU-T recommendation T.6) images saved as TIFF
files with a minimum scan resolution of 400 dpi.

Lower scan resolutions produce noticeable artifacts if a page needs to be
straightened in post-processing.

Lossy compression formats, such as JPEG, should NEVER be used to save pages
of text, since the compression format destroys edge resolution and contrast.
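
Contributors who want to check a scan against the guidelines above before sending it could use something like this Python sketch. It is not a tool used here. It reads only the first image directory of a classic (non-BigTIFF) file using the standard library, and the function names are hypothetical. The tag numbers (259 Compression, 282 XResolution, 296 ResolutionUnit) and the value 4 for Group 4 come from the TIFF 6.0 specification.

```python
import struct

COMPRESSION_TAG = 259       # Compression; 4 means CCITT Group 4 (T.6)
X_RESOLUTION_TAG = 282      # XResolution, stored as a RATIONAL
RESOLUTION_UNIT_TAG = 296   # ResolutionUnit; 2 = inch, 3 = centimeter
GROUP4 = 4
MIN_DPI = 400               # minimum resolution quoted above


def read_first_ifd(data):
    """Return (byte_order, {tag: (type, raw 4-byte value field)}) for the first IFD."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (count,) = struct.unpack_from(order + "H", data, ifd_offset)
    tags = {}
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i          # each IFD entry is 12 bytes
        tag, typ, _n = struct.unpack_from(order + "HHI", data, entry)
        tags[tag] = (typ, data[entry + 8:entry + 12])
    return order, tags


def scan_summary(data):
    """Return (compression, dpi) for TIFF bytes; dpi is None if it is not recorded."""
    order, tags = read_first_ifd(data)

    def small_int(tag, default):
        if tag not in tags:
            return default
        typ, raw = tags[tag]
        # A SHORT (type 3) sits in the first two bytes of the value field.
        return struct.unpack_from(order + ("H" if typ == 3 else "I"), raw)[0]

    compression = small_int(COMPRESSION_TAG, 1)   # spec default: uncompressed
    unit = small_int(RESOLUTION_UNIT_TAG, 2)      # spec default: inch
    dpi = None
    if X_RESOLUTION_TAG in tags and unit in (2, 3):
        (offset,) = struct.unpack(order + "I", tags[X_RESOLUTION_TAG][1])
        num, den = struct.unpack_from(order + "II", data, offset)
        dpi = num / den
        if unit == 3:
            dpi *= 2.54                           # pixels/cm to pixels/inch
    return compression, dpi


def meets_guideline(data):
    """True if the first page is Group 4 compressed at 400 dpi or more."""
    compression, dpi = scan_summary(data)
    return compression == GROUP4 and dpi is not None and dpi >= MIN_DPI
```

Typical use would be `meets_guideline(open("page.tif", "rb").read())`, where `page.tif` is a placeholder name for one of your own scans.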

OCR

OCR has been part of the post-processing of scans for many years now
and is slowly being applied to older PDF files. It is a slow process and
it will take many years to complete.

Document Scanning Station



Tape processing over the years

These photos were taken in rooms that no longer exist at CHM, ca. 2006.
The rooms were demolished when the Revolution exhibit was built.
They were roughly where the gift shop and orientation theatre are now.
You can see four XServe RAIDs, which are still in use in 2021 with 2.5" 1 TB Toshiba SATA drives and PATA/SATA adapters.


Where does the source material come from?

Most of the documents are from my personal collection that I have either bought or been given over the course of many decades in the computer industry, or have been loaned to me for scanning.

I have a VERY large backlog of material to scan and don't actively solicit material to work on.

If I do decide to scan something from a donor I will return it if requested.

Unless it is a very rare document, I probably won't accept something that requires manual scanning, since my scanning time each day is limited.

I do not personally archive any paper that has been scanned.

The scanning process I use is destructive. Bindings are removed and paper is recycled.

Original documents that are still in good condition may be donated to the Computer History Museum for archiving, depending on whether they are within CHM's collecting scope.

The CHM running lot number for my donated documents is X6512.2012

This project was started in the early 90's to downsize my collection of paper, and that continues to be its primary purpose.
And... the site looks this way for a reason: to leave it static and easy to mirror. So don't remind me that it looks like it's from 1995.

at bitsavers dot org