How to Scan Documents from the Command Line

Jul 30, 2025

In this article, we show how to scan documents from the command line (CLI), so that scanning and saving documents can be automated or scripted.

Several APIs exist for accessing document scanners. The following table compares them.

| Feature | TWAIN | WIA (Windows Image Acquisition) | SANE (Scanner Access Now Easy) | ICA (Image Capture Architecture) | eSCL |
| --- | --- | --- | --- | --- | --- |
| Developer | TWAIN Working Group | Microsoft | SANE Open-Source Community | Apple | Mopria |
| Operating Systems | Windows, macOS, Linux (partial) | Windows | Linux, macOS, Unix-like | macOS, iOS | Cross-platform (Windows/macOS/Linux/mobile) |
| Supported Scanners | Broad | Less broad | Less broad (community support) | Less broad | Only modern network scanners/MFPs |
| Functionality | Advanced controls (ADF, barcode detection, etc.) | Basic controls like color mode | Medium-to-advanced controls | Basic controls like color mode | Basic controls like color mode |

We are going to use each of these APIs to scan documents from the command line. Only SANE ships with a command-line tool, so we need to write our own tools for the other APIs.

SANE

SANE has a command-line tool scanimage. Here is its basic usage:

List connected scanners.
```
scanimage -L
```
Acquire an image with a specified scanner.
```
scanimage -d "scanner name" -o out.png
```
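For scripting, the two commands above can be wrapped in Python. The following is a minimal sketch, assuming scanimage is on the PATH and prints its usual one-line-per-device listing; the function names are hypothetical and not part of SANE.

```python
import re
import subprocess

# scanimage -L prints one line per device, for example:
#   device `epson2:libusb:001:004' is a Epson GT-1500 flatbed scanner
DEVICE_LINE = re.compile(r"^device `(?P<name>.+?)' is a (?P<description>.+)$")


def parse_device_list(text):
    """Return (device_name, description) tuples found in `scanimage -L` output."""
    devices = []
    for line in text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            devices.append((match.group("name"), match.group("description")))
    return devices


def list_devices():
    """Run `scanimage -L` and parse its output (requires SANE to be installed)."""
    result = subprocess.run(["scanimage", "-L"], capture_output=True, text=True, check=True)
    return parse_device_list(result.stdout)


def scan_to_file(device_name, output_path):
    """Scan one page with the given device and write it to output_path."""
    subprocess.run(["scanimage", "-d", device_name, "-o", output_path], check=True)
```

Keeping the parsing separate from the subprocess call makes it possible to check the parser against sample output without a scanner attached.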

The command-line tools we write for the other APIs will follow the same usage.
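One way to give every tool the same interface is a shared argument parser that mirrors the scanimage options used above. This is only a sketch: the program name, the default output file, and the `list_scanners` and `scan` hooks are assumptions introduced for illustration.

```python
import argparse


def build_parser(prog="scan-cli"):
    """Argument parser mirroring the scanimage options used in this article."""
    parser = argparse.ArgumentParser(prog=prog, description="Scan a document from the command line.")
    parser.add_argument("-L", "--list-devices", action="store_true",
                        help="list connected scanners and exit")
    parser.add_argument("-d", "--device-name", help="name of the scanner to use")
    parser.add_argument("-o", "--output-file", default="out.png",
                        help="path of the image to write")
    return parser


def main(list_scanners, scan, argv=None):
    """Dispatch to API-specific callables (hypothetical hooks supplied by each tool)."""
    args = build_parser().parse_args(argv)
    if args.list_devices:
        # Print one scanner name per line, like `scanimage -L` lists one device per line
        for name in list_scanners():
            print(name)
        return 0
    scan(args.device_name, args.output_file)
    return 0
```

Each API-specific tool would then pass its own listing and scanning functions to `main`.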

TWAIN

TWAIN exposes a C/C++ interface, and the pytwain library (imported as twain) wraps it for Python. We are going to use Python to write the command-line tool.

Here are the key parts:

Import the library.
```
import twain
```

List scanners.

with twain.SourceManager() as sm:
    for source in sm.source_list:
        print(source)

Scan with a scanner.

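from io import BytesIO
from PIL import Image

output_path = "out.png"  # Pillow picks the file format from the extension
with twain.SourceManager() as sm:
    src = sm.open_source("scanner_name")
    # Scan without showing the driver's user interface
    src.request_acquire(show_ui=False, modal_ui=False)
    # A native transfer returns a handle to a DIB held in memory
    (handle, remaining_count) = src.xfer_image_natively()
    bmp_bytes = twain.dib_to_bm_file(handle)
    img = Image.open(BytesIO(bmp_bytes), formats=["bmp"])
    img.save(output_path)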
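A native TWAIN transfer yields a device-independent bitmap (DIB), which is a BMP file without its 14-byte file header. The sketch below illustrates what such a DIB-to-BMP conversion involves. It is not the library's own implementation, the function name is hypothetical, and it assumes an uncompressed DIB that starts with a BITMAPINFOHEADER (no BI_BITFIELDS color masks).

```python
import struct

BITMAPFILEHEADER_SIZE = 14  # 'BM' + file size + two reserved words + pixel data offset


def dib_to_bmp_bytes(dib):
    """Prepend a BITMAPFILEHEADER to a DIB that starts with a BITMAPINFOHEADER."""
    # Fields read from the info header: biSize at offset 0,
    # biBitCount at offset 14 and biClrUsed at offset 32.
    header_size = struct.unpack_from("<I", dib, 0)[0]
    bit_count = struct.unpack_from("<H", dib, 14)[0]
    colors_used = struct.unpack_from("<I", dib, 32)[0]

    # Palettised images (8 bits per pixel or fewer) carry a color table of 4-byte entries.
    if colors_used == 0 and bit_count <= 8:
        colors_used = 1 << bit_count
    palette_size = colors_used * 4

    # Pixel data starts after the file header, the info header and the color table.
    pixel_offset = BITMAPFILEHEADER_SIZE + header_size + palette_size
    file_size = BITMAPFILEHEADER_SIZE + len(dib)
    file_header = struct.pack("<2sIHHI", b"BM", file_size, 0, 0, pixel_offset)
    return file_header + dib
```

The resulting bytes can be opened by any BMP reader, which is why the snippet above can pass them straight to Pillow.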

WIA

WIA exposes a native API as well as a COM automation layer. We are going to drive the COM layer from Python using pywin32.

Here are the key parts:

Import libraries.

from PIL import Image
import pythoncom
from win32com.client import Dispatch

List scanners.

manager = Dispatch("WIA.DeviceManager")
devices = manager.DeviceInfos
print("Available scanners:")
for i in range(1, devices.Count + 1):
    device = devices.Item(i)
    # Check if the device is a scanner (Type = 1)
    if device.Type == 1:
        print(f"  Name: {device.Properties['Name'].Value}")
        print(f"  ID: {device.DeviceID}")
        print(f"  Description: {device.Properties['Description'].Value}")
        print("  ----------------")

Scan with a scanner.

from io import BytesIO

wia = Dispatch("WIA.CommonDialog")
manager = Dispatch("WIA.DeviceManager")

devices = manager.DeviceInfos
selected_device = None
scanner_name = "target scanner name"
output_path = "out.png"
for i in range(1, devices.Count + 1):
    device = devices.Item(i)
    if device.Type == 1 and device.Properties['Name'].Value == scanner_name:
        selected_device = device.Connect()  # Select the scanner by name
        break

img = None
if selected_device is None:
    img = wia.ShowAcquireImage()  # Show the scanning dialog with scanner selection
else:
    # WIA collections are 1-based, so Items(1) is the scanner's first item
    img = wia.ShowTransfer(selected_device.Items(1))

# The result is a WIA ImageFile COM object, not an array, so pass its raw bytes
# to Pillow, which picks the output format from the file extension
pil_img = Image.open(BytesIO(bytes(img.FileData.BinaryData)))
pil_img.save(output_path)
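ShowTransfer also accepts an optional format identifier as its second argument, so the scanner driver can be asked for a specific file format directly. The sketch below maps file extensions to the WIA format GUIDs; the helper name is hypothetical, and whether a given driver honors every format is an assumption to verify on the target device.

```python
import os

# WIA format identifiers (the same GUIDs GDI+ uses for its image formats).
WIA_FORMAT_IDS = {
    ".bmp": "{B96B3CAB-0728-11D3-9D7B-0000F81EF32E}",
    ".jpg": "{B96B3CAE-0728-11D3-9D7B-0000F81EF32E}",
    ".jpeg": "{B96B3CAE-0728-11D3-9D7B-0000F81EF32E}",
    ".png": "{B96B3CAF-0728-11D3-9D7B-0000F81EF32E}",
    ".gif": "{B96B3CB0-0728-11D3-9D7B-0000F81EF32E}",
    ".tif": "{B96B3CB1-0728-11D3-9D7B-0000F81EF32E}",
    ".tiff": "{B96B3CB1-0728-11D3-9D7B-0000F81EF32E}",
}


def wia_format_for(output_path, default=".bmp"):
    """Pick the WIA format GUID matching the output file extension."""
    ext = os.path.splitext(output_path)[1].lower()
    return WIA_FORMAT_IDS.get(ext, WIA_FORMAT_IDS[default])


# Hypothetical usage with the COM objects from the snippet above:
#   img = wia.ShowTransfer(selected_device.Items(1), wia_format_for(output_path))
#   img.SaveFile(output_path)
```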

eSCL

eSCL is a RESTful interface. Network scanners advertise themselves via Bonjour, so a client can discover them and send HTTP requests to scan documents. We are again going to use Python to write the scanning tool.

Import the libraries.

import time  # used for the discovery timeout below
from zeroconf import ServiceBrowser, Zeroconf
from requests import get as requests_get, post as requests_post

List scanners by detecting Bonjour services whose type is _uscan._tcp.local..

class ESCLScannerListener:
    def __init__(self):
        self.scanners = []

    def add_service(self, zeroconf, type_, name):
        info = zeroconf.get_service_info(type_, name)
        if info:
            scanner_info = {
                'name': name,
                'type': type_,
                # parsed_addresses() returns printable IP strings instead of packed bytes
                'addresses': info.parsed_addresses(),
                'port': info.port,
                'properties': info.properties
            }
            self.scanners.append(scanner_info)

    def update_service(self, zeroconf, type_, name):
        # Expected by recent zeroconf releases; nothing to refresh in this simple tool
        pass

    def remove_service(self, zeroconf, type_, name):
        print(f"Scanner removed: {name}")
           
def discover_escl_scanners(timeout=2):
    zeroconf = Zeroconf()
    listener = ESCLScannerListener()
    browser = ServiceBrowser(zeroconf, "_uscan._tcp.local.", listener)
    print(f"Discovering ESCL scanners for {timeout} seconds...")
    time.sleep(timeout)
    zeroconf.close()
    return listener.scanners
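Each discovered entry carries an address, a port and the TXT properties of the service, which is enough to build the scanner's base URL. The helper below is a sketch with a hypothetical name. It assumes the TXT record uses the `rs` key for the resource path and falls back to `eSCL`, the path used in the requests in this article.

```python
def escl_base_url(address, port, properties=None, secure=False):
    """Build the eSCL root URL for one discovered scanner."""
    resource = "eSCL"  # fallback resource path
    # zeroconf exposes TXT properties as a dict of bytes keys and bytes values
    raw = (properties or {}).get(b"rs")
    if raw:
        resource = raw.decode("utf-8").strip("/") or resource
    if ":" in address:
        # IPv6 literals need brackets inside a URL
        address = "[%s]" % address
    scheme = "https" if secure else "http"
    return "%s://%s:%d/%s" % (scheme, address, port, resource)
```

For example, a scan job would then be posted to the returned URL followed by `/ScanJobs`.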

Scan with a scanner. The scanning configuration is expressed in XML.

def scan(scanner_address, output_path="scanned.jpg"):
    xml = '''<scan:ScanSettings xmlns:scan="http://schemas.hp.com/imaging/escl/2011/05/03" xmlns:dd="http://www.hp.com/schemas/imaging/con/dictionaries/1.0/" xmlns:dd3="http://www.hp.com/schemas/imaging/con/dictionaries/2009/04/06" xmlns:fw="http://www.hp.com/schemas/imaging/con/firewall/2011/01/05" xmlns:scc="http://schemas.hp.com/imaging/escl/2011/05/03" xmlns:pwg="http://www.pwg.org/schemas/2010/12/sm"><pwg:Version>2.1</pwg:Version><scan:Intent>Photo</scan:Intent><pwg:ScanRegions><pwg:ScanRegion><pwg:Height>3300</pwg:Height><pwg:Width>2550</pwg:Width><pwg:XOffset>0</pwg:XOffset><pwg:YOffset>0</pwg:YOffset></pwg:ScanRegion></pwg:ScanRegions><pwg:InputSource>Platen</pwg:InputSource><scan:DocumentFormatExt>image/jpeg</scan:DocumentFormatExt><scan:XResolution>300</scan:XResolution><scan:YResolution>300</scan:YResolution><scan:ColorMode>Grayscale8</scan:ColorMode><scan:CompressionFactor>25</scan:CompressionFactor><scan:Brightness>1000</scan:Brightness><scan:Contrast>1000</scan:Contrast></scan:ScanSettings>'''

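    resp = requests_post('http://{0}/eSCL/ScanJobs'.format(scanner_address), data=xml, headers={'Content-Type': 'text/xml'})
    if resp.status_code != 201:
        raise RuntimeError('Scan job was not created (HTTP {0})'.format(resp.status_code))
    # The Location header points at the new job; NextDocument returns the scanned page
    url = '{0}/NextDocument'.format(resp.headers['Location'])
    r = requests_get(url)
    r.raise_for_status()
    with open(output_path, 'wb') as f:
        f.write(r.content)
    return output_path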
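Rather than hard-coding the XML, the settings can be generated from parameters. The sketch below rebuilds the main elements of the document above with the standard library. It assumes scan regions are expressed in 1/300 inch units, which matches the 2550 x 3300 (US Letter) region used above; the function names and defaults are hypothetical.

```python
import xml.etree.ElementTree as ET

SCAN_NS = "http://schemas.hp.com/imaging/escl/2011/05/03"
PWG_NS = "http://www.pwg.org/schemas/2010/12/sm"


def mm_to_units(mm):
    """Convert millimetres to 1/300 inch units (assumed region unit)."""
    return round(mm / 25.4 * 300)


def build_scan_settings(width_mm=215.9, height_mm=279.4, resolution=300,
                        color_mode="Grayscale8", source="Platen", mime="image/jpeg"):
    """Return an eSCL ScanSettings document as a string."""
    ET.register_namespace("scan", SCAN_NS)
    ET.register_namespace("pwg", PWG_NS)

    def add(parent, ns, tag, text=None):
        # Create a namespaced child element with optional text content
        child = ET.SubElement(parent, "{%s}%s" % (ns, tag))
        if text is not None:
            child.text = str(text)
        return child

    root = ET.Element("{%s}ScanSettings" % SCAN_NS)
    add(root, PWG_NS, "Version", "2.1")
    add(root, SCAN_NS, "Intent", "Photo")
    region = add(add(root, PWG_NS, "ScanRegions"), PWG_NS, "ScanRegion")
    add(region, PWG_NS, "Height", mm_to_units(height_mm))
    add(region, PWG_NS, "Width", mm_to_units(width_mm))
    add(region, PWG_NS, "XOffset", 0)
    add(region, PWG_NS, "YOffset", 0)
    add(root, PWG_NS, "InputSource", source)
    add(root, SCAN_NS, "DocumentFormatExt", mime)
    add(root, SCAN_NS, "XResolution", resolution)
    add(root, SCAN_NS, "YResolution", resolution)
    add(root, SCAN_NS, "ColorMode", color_mode)
    return ET.tostring(root, encoding="unicode")
```

Calling `build_scan_settings(width_mm=210, height_mm=297)` would request an A4 region instead of the default Letter size.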

ICA

Using the Image Capture API is a bit more complicated, so we are going to create a Swift command-line project to implement the tool.

Here are the key parts:

Create a scanner manager class to list the scanners.

import Foundation
import ImageCaptureCore

class ScannerManager: NSObject, ICDeviceBrowserDelegate {
    private var deviceBrowser: ICDeviceBrowser!
    private var scanners: [ICScannerDevice] = []
    private var currentScanner: ICScannerDevice?
    private var scanCompletionHandler: ((Result<URL, Error>) -> Void)?
    private var targetURL: URL?
       
    override init() {
        super.init()
        setupDeviceBrowser()
    }
       
    private func setupDeviceBrowser() {
        deviceBrowser = ICDeviceBrowser()
        deviceBrowser.delegate = self
        let mask = ICDeviceTypeMask(rawValue:
                    ICDeviceTypeMask.scanner.rawValue |
                    ICDeviceLocationTypeMask.local.rawValue |
                    ICDeviceLocationTypeMask.bonjour.rawValue |
                    ICDeviceLocationTypeMask.shared.rawValue)
        deviceBrowser.browsedDeviceTypeMask = mask!
        deviceBrowser.start()
    }
       
    func listScanners(completion: @escaping ([ICScannerDevice]) -> Void) {
        DispatchQueue.main.asyncAfter(deadline: .now() + 1) {
            completion(self.scanners)
        }
    }
       
    // MARK: - ICDeviceBrowserDelegate
       
    func deviceBrowser(_ browser: ICDeviceBrowser, didAdd device: ICDevice, moreComing: Bool) {
        guard let scanner = device as? ICScannerDevice else { return }
        scanners.append(scanner)
    }
       
    func deviceBrowser(_ browser: ICDeviceBrowser, didRemove device: ICDevice, moreGoing: Bool) {
        if let index = scanners.firstIndex(where: { $0 == device }) {
            scanners.remove(at: index)
        }
    }
}

Make the manager class conform to ICScannerDeviceDelegate and add the scanning-related functions.

func device(_ device: ICDevice, didCloseSessionWithError error: (any Error)?) {
    print("did close")
}

func didRemove(_ device: ICDevice) {
    print("did remove")
}

func device(_ device: ICDevice, didOpenSessionWithError error: (any Error)?) {
    print("did open")
    DispatchQueue.main.asyncAfter(deadline: .now() + 1) { [weak self] in
        guard let self = self else { return }
        guard let scanner = currentScanner else { return }
        scanner.transferMode = .fileBased
        scanner.downloadsDirectory = URL(fileURLWithPath: NSTemporaryDirectory())
        scanner.documentName = "scan"
        scanner.documentUTI = kUTTypeJPEG as String
        if let functionalUnit = scanner.selectedFunctionalUnit as? ICScannerFunctionalUnit {
            // Pick the first supported resolution of at least 300 dpi, or fall back to the highest one
            if let resolution = functionalUnit.supportedResolutions.integerGreaterThanOrEqualTo(300) ?? functionalUnit.supportedResolutions.last {
                functionalUnit.resolution = resolution
            }
               
            let a4Width: CGFloat = 210.0 // mm
            let a4Height: CGFloat = 297.0 // mm
            let widthInPoints = a4Width * 72.0 / 25.4 // convert to point
            let heightInPoints = a4Height * 72.0 / 25.4
               
            functionalUnit.scanArea = NSMakeRect(0, 0, widthInPoints, heightInPoints)
            functionalUnit.pixelDataType = .RGB
            functionalUnit.bitDepth = .depth8Bits

            scanner.requestScan()
        }
    }
}

// MARK: - ICScannerDeviceDelegate

func scannerDevice(_ scanner: ICScannerDevice, didScanTo url: URL) {
    print("did scan to")
    print(url.absoluteString)
    guard let targetURL = targetURL else {
        scanCompletionHandler?(.failure(NSError(domain: "ScannerError", code: -2, userInfo: [NSLocalizedDescriptionKey: "No target URL set"])))
        return
    }
    do {
        try FileManager.default.moveItem(at: url, to: targetURL)
        scanCompletionHandler?(.success(targetURL))
    } catch {
        scanCompletionHandler?(.failure(error))
    }
}

// MARK: - Scan Operations

func startScan(scanner: ICScannerDevice, outputPath: String, completion: @escaping (Result<URL, Error>) -> Void) {
    currentScanner = scanner
    scanCompletionHandler = completion
    targetURL = URL(fileURLWithPath: outputPath)
       
    scanner.delegate = self
    scanner.requestOpenSession()
}

Dynamic Web TWAIN RESTful API

Dynamic Web TWAIN provides a RESTful API feature for scanning documents using TWAIN, WIA, SANE, ICA or eSCL. You can find its details on this page.

Here are the benefits of using Dynamic Web TWAIN’s RESTful API:

One unified interface to use all the mainstream document scanning APIs with complete scanner controls on different platforms.
Share scanners via the network so that mobile devices can also access document scanners.
Any programming language that can send HTTP requests can use the document scanning APIs.

Here are the key parts using the Python wrapper:

Import the library and declare several variables. You can apply for a license here.

from dynamsoftservice import ScannerController, ScannerType
license_key = "LICENSE-KEY"
host = "http://127.0.0.1:18622"
scannerController = ScannerController()

List scanners.

def list_scanners():
    """List all available scanners"""
    scanners = scannerController.getDevices(host)
    return scanners

Scan with a scanner.

def scan_document(output_path="scan.png", scanner_name=None):
    """
    Scan a document using Web TWAIN service and save as image file
       
    Parameters:
        output_path: Path to save scanned image
        scanner_name: Name of specific scanner to use (None shows dialog)
    """
    scanners = list_scanners()
    selectedScanner = None
    if scanner_name is not None:
        for scanner in scanners:
            if scanner['name'] == scanner_name:
                selectedScanner = scanner
                break
       
    parameters = {
        "license": license_key
    }

    if selectedScanner is not None:
        parameters["device"] = selectedScanner["device"]
           
    parameters["config"] = {
        "IfShowUI": False,
        "PixelType": 2,
        "Resolution": 200,
        "IfFeederEnabled": False,
        "IfDuplexEnabled": False,
    }
       
    job = scannerController.createJob(host, parameters)
    print(job)
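    if "jobuid" in job:
        job_id = job["jobuid"]
        streams = scannerController.getImageStreams(host, job_id)
        if streams:
            # Write the first scanned page; the with block closes the file
            with open(output_path, "wb") as f:
                f.write(streams[0])
            return output_path
    return None  # no job was created or no image was returned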
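The job parameters above can be factored into a small helper so that a command-line tool can override them per run. This sketch uses only the keys shown in the example above. The helper name and its arguments are hypothetical, and the color names map to the standard TWAIN pixel type values (0 for black and white, 1 for gray, 2 for RGB).

```python
# TWAIN pixel type constants: TWPT_BW = 0, TWPT_GRAY = 1, TWPT_RGB = 2
PIXEL_TYPES = {"bw": 0, "gray": 1, "color": 2}


def build_job_parameters(license_key, device=None, color="color",
                         resolution=200, use_feeder=False, duplex=False):
    """Build the parameters dictionary passed to createJob."""
    parameters = {
        "license": license_key,
        "config": {
            "IfShowUI": False,
            "PixelType": PIXEL_TYPES[color],
            "Resolution": resolution,
            "IfFeederEnabled": use_feeder,
            "IfDuplexEnabled": duplex,
        },
    }
    # Only name a device when one was selected, as in the example above
    if device is not None:
        parameters["device"] = device
    return parameters
```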

Apart from the RESTful API, Dynamic Web TWAIN also provides a JavaScript library with a dedicated viewer, complete wrapping of the document scanning APIs, local cache and various supplementary APIs to provide a browser-based document scanning solution. Visit its online demo to try it out.

Source Code

Get the source code on GitHub and learn about how to use the command line tools:

https://github.com/tony-xlh/document-scanner-cli/
