How to Build a macOS Camera Barcode Scanner in C++ Using AVFoundation

This is the final article in a series on building a lightweight C++ camera library across multiple platforms. We’ve already covered Windows and Linux, and now it’s time to dive into macOS. In this article, we’ll tap into AVFoundation to handle camera capture under the hood and leverage Cocoa for our UI essentials. You’ll see how to bridge Objective-C with C++ to build a camera library, and then integrate it with the Dynamsoft Barcode Reader SDK to create a barcode scanner on macOS.

What you’ll build: A native macOS barcode scanner application that uses AVFoundation for camera capture, Cocoa for the preview window, and Dynamsoft Barcode Reader SDK for real-time barcode decoding — all written in C++ with Objective-C bridging.

Key Takeaways

  • AVFoundation’s AVCaptureSession and AVCaptureVideoDataOutput provide direct camera access on macOS without requiring OpenCV.
  • Objective-C++ (.mm files) lets you mix Objective-C framework calls with C++ application logic in a single translation unit.
  • BGRA-to-RGB pixel conversion is required when passing AVFoundation frames to barcode SDKs that expect RGB input.
  • The Dynamsoft Barcode Reader SDK integrates with any raw-frame pipeline — pass the RGB buffer and receive decoded barcode results.

Common Developer Questions

  • How do I access the macOS camera from C++ without OpenCV?
  • How do I bridge Objective-C and C++ to use AVFoundation in a CMake project?
  • How do I build a real-time barcode scanner on macOS using AVFoundation and Dynamsoft Barcode Reader?

Prerequisites

  • A Mac with the Xcode Command Line Tools installed (provides clang plus the Cocoa and AVFoundation headers).
  • CMake for configuring and building the project.
  • The Dynamsoft Barcode Reader SDK and a valid license key.

Implement Camera Capture with AVFoundation

Let’s review the camera-related functions that need to be implemented for macOS:

  • std::vector<CaptureDeviceInfo> ListCaptureDevices(): Enumerates available cameras.
  • bool Open(int cameraIndex): Activates a specified camera.
  • void Release(): Releases the camera.
  • FrameData CaptureFrame(): Captures a frame from the camera.

Update the Camera Header File for macOS

Open the Camera.h header file and add the following changes:

class CAMERA_API Camera
{
public:
#ifdef _WIN32
    ...
#elif __APPLE__
    Camera() noexcept; 
    ~Camera();
#endif

    ...

private:
    ...

#ifdef __APPLE__
    void *captureSession; 
    void *videoOutput;
#endif
};

#endif 

Explanation

  • noexcept declares that the constructor will not throw exceptions.
  • captureSession and videoOutput are opaque void * handles that store the AVCaptureSession and AVCaptureVideoDataOutput objects, keeping Objective-C types out of the cross-platform C++ header.

Enumerate Available Cameras

Enumerate available cameras using the AVFoundation API:

std::vector<CaptureDeviceInfo> ListCaptureDevices()
{
    @autoreleasepool {
        std::vector<CaptureDeviceInfo> devicesInfo;

        NSArray<AVCaptureDevice *> *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
        for (AVCaptureDevice *device in devices)
        {
            CaptureDeviceInfo info = {};
            strncpy(info.friendlyName, [[device localizedName] UTF8String], sizeof(info.friendlyName) - 1);
            devicesInfo.push_back(info);
        }

        return devicesInfo;
    }
}

Explanation

  • AVCaptureDevice represents a physical capture device.
  • AVMediaTypeVideo restricts the enumeration to video devices.
  • localizedName returns a human-readable device name.
  • Note that devicesWithMediaType: has been deprecated since macOS 10.15; AVCaptureDeviceDiscoverySession is the modern replacement, though the older API still works for a simple enumeration.

Open a Camera with AVCaptureSession

The steps to open a camera are as follows:

  1. Get the available capture devices.
  2. Create a device input and a capture session.
  3. Configure the video output.
  4. Start the capture session.
bool Camera::Open(int cameraIndex)
{
    @autoreleasepool {
        NSArray<AVCaptureDevice *> *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
        if (cameraIndex < 0 || (NSUInteger)cameraIndex >= [devices count])
        {
            std::cerr << "Camera index out of range." << std::endl;
            return false;
        }

        AVCaptureDevice *device = devices[cameraIndex];

        NSError *error = nil;
        AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
        if (!input)
        {
            std::cerr << "Error creating device input: " << [[error localizedDescription] UTF8String] << std::endl;
            return false;
        }

        AVCaptureSession *cs = [[AVCaptureSession alloc] init];
        captureSession = (void *)cs;
        if (![cs canAddInput:input])
        {
            std::cerr << "Cannot add device input to session." << std::endl;
            return false;
        }
        [cs addInput:input];

        AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
        output.videoSettings = @{(NSString *)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_32BGRA)};
        output.alwaysDiscardsLateVideoFrames = YES;

        videoOutput = (void *)output;

        if (![cs canAddOutput:output])
        {
            std::cerr << "Cannot add video output to session." << std::endl;
            return false;
        }
        [cs addOutput:output];

        [cs startRunning];

        return true;
    }
}

Explanation

  • AVCaptureDeviceInput captures data from the chosen camera device.
  • AVCaptureSession manages the flow of data from the device to the output. We store it in captureSession.
  • AVCaptureVideoDataOutput provides the raw video frames from the capture device. We store it in videoOutput.
  • kCVPixelBufferPixelFormatTypeKey requests a specific pixel format from the output; here we ask for kCVPixelFormatType_32BGRA, so each frame arrives as 4 bytes per pixel (B, G, R, A).

Release the Camera Session

Stop the capture session and release the resources:

void Camera::Release()
{
    if (captureSession)
    {
        AVCaptureSession *session = (__bridge AVCaptureSession *)captureSession;

        // Stop the session first so no more frames are delivered while
        // the output is being removed.
        [session stopRunning];

        if (videoOutput)
        {
            AVCaptureVideoDataOutput *output = (__bridge AVCaptureVideoDataOutput *)videoOutput;
            [session removeOutput:output];
            [output release]; // balance the alloc/init in Open (manual reference counting)
            videoOutput = nullptr;
        }

        [session release]; // balance the alloc/init in Open
        captureSession = nullptr;
    }
}

Capture a Video Frame as RGB Data

Capture a frame from the camera using the AVCaptureVideoDataOutputSampleBufferDelegate protocol:

@interface CaptureDelegate : NSObject <AVCaptureVideoDataOutputSampleBufferDelegate>
{
    FrameData *frame;
    dispatch_semaphore_t semaphore;
}

- (instancetype)initWithFrame:(FrameData *)frame semaphore:(dispatch_semaphore_t)semaphore;

@end

@implementation CaptureDelegate

- (instancetype)initWithFrame:(FrameData *)frame semaphore:(dispatch_semaphore_t)semaphore {
    self = [super init];
    if (self) {
        self->frame = frame;
        self->semaphore = semaphore;
    }
    return self;
}

- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    
    // One-shot guard: the first delivered frame fills `frame`; ignore any
    // later callbacks so they cannot reallocate or overwrite it.
    if (frame->rgbData != nullptr) {
        return;
    }

    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);

    if (!imageBuffer) {
        std::cerr << "Failed to get image buffer." << std::endl;
        dispatch_semaphore_signal(semaphore);
        return;
    }

    CVPixelBufferLockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
    void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    
    frame->width = width;
    frame->height = height;
    frame->size = width * height * 3;
    frame->rgbData = new unsigned char[frame->size];

    OSType pixelFormat = CVPixelBufferGetPixelFormatType(imageBuffer);

    if (pixelFormat == kCVPixelFormatType_32BGRA) {
        unsigned char *src = (unsigned char *)baseAddress;
        unsigned char *dst = frame->rgbData;

        for (size_t y = 0; y < height; ++y) {
            for (size_t x = 0; x < width; ++x) {
                size_t offset = y * bytesPerRow + x * 4;
                dst[0] = src[offset + 2]; // R
                dst[1] = src[offset + 1]; // G
                dst[2] = src[offset + 0]; // B
                dst += 3;
            }
        }
    } else {
        std::cerr << "Unsupported pixel format." << std::endl;
    }

    CVPixelBufferUnlockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
    dispatch_semaphore_signal(semaphore);
}


@end

FrameData Camera::CaptureFrame()
{
    @autoreleasepool {
        FrameData frame = {};

        if (!captureSession || !videoOutput) {
            std::cerr << "Capture session is not initialized." << std::endl;
            return frame;
        }

        AVCaptureSession *session = (__bridge AVCaptureSession *)captureSession;
        AVCaptureVideoDataOutput *vo = (__bridge AVCaptureVideoDataOutput *)videoOutput;
 
        dispatch_semaphore_t semaphore = dispatch_semaphore_create(0);

        CaptureDelegate *delegate = [[CaptureDelegate alloc] initWithFrame:&frame semaphore:semaphore];
        [vo setSampleBufferDelegate:delegate
                              queue:dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)];

        dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);

        // Detach the delegate before returning: `frame` lives on this stack,
        // so later callbacks must not touch it. The release balances the
        // alloc above (this file uses manual reference counting).
        [vo setSampleBufferDelegate:nil queue:NULL];
        [delegate release];

        frameWidth = frame.width;
        frameHeight = frame.height;
        return frame;
    }
}

Explanation

  • setSampleBufferDelegate sets the object that will receive frames from the AVCaptureVideoDataOutput.
  • FrameData *frame points to a struct where the captured image data is stored (width, height, and the RGB buffer).
  • dispatch_semaphore_t semaphore is used for synchronization. Once the frame is processed, the delegate signals the semaphore so the calling code knows the frame is ready.
  • In captureOutput:didOutputSampleBuffer:fromConnection, we extract the pixel buffer and convert BGRA to RGB (3 bytes per pixel).
  • CMSampleBufferGetImageBuffer(sampleBuffer) returns a CVImageBufferRef containing the pixel data.
  • CVPixelBufferLockBaseAddress and CVPixelBufferUnlockBaseAddress ensure thread safety during reading.
  • dispatch_semaphore_signal(semaphore) notifies whoever is waiting on this semaphore that the frame data is now ready.

Build the Cocoa Preview Window

Update the CameraPreview Header for macOS

Add macOS-specific members to the CameraPreview.h header file:

class CAMERA_API CameraWindow
{
public:
    ...

#ifdef _WIN32
    ...
#elif __APPLE__
    void *nsWindow; 
    void *contentView;
#endif
};

Explanation

  • nsWindow and contentView are used to manage the window and the corresponding content view on macOS.

Create a Custom NSView for Rendering Frames, Contours, and Text

Create a custom NSView subclass to handle drawing:

struct CameraContentViewImpl {
    std::vector<unsigned char> rgbData;
    int frameWidth = 0;
    int frameHeight = 0;
    int x = 0;
    int y = 0;
    int fontSize = 0; 
    std::vector<std::pair<int, int>> contourPoints;
    std::string displayText;
    CameraWindow::Color textColor;
};

@interface CameraContentView : NSView
{
    CameraContentViewImpl* impl; 
}
- (void)updateFrame:(const unsigned char*)data width:(int)width height:(int)height;
- (void)updateContour:(const std::vector<std::pair<int, int>>&)points;
- (void)updateText:(const std::string&)text
                x:(int)x
                y:(int)y
         fontSize:(int)fontSize
            color:(const CameraWindow::Color&)color;
@end

@implementation CameraContentView

- (instancetype)initWithFrame:(NSRect)frameRect {
    self = [super initWithFrame:frameRect];
    if (self) {
        impl = new CameraContentViewImpl();
    }
    return self;
}

- (void)dealloc {
    delete impl;
    [super dealloc];
}

- (void)updateFrame:(const unsigned char*)data width:(int)width height:(int)height {
    impl->rgbData.assign(data, data + (width * height * 3));
    impl->frameWidth = width;
    impl->frameHeight = height;
    [self setNeedsDisplay:YES];
}

- (void)updateContour:(const std::vector<std::pair<int, int>>&)points {
    impl->contourPoints = points;
    [self setNeedsDisplay:YES];
}

- (void)updateText:(const std::string&)text
                x:(int)x
                y:(int)y
         fontSize:(int)fontSize
            color:(const CameraWindow::Color&)color {
    impl->displayText = text;
    impl->x = x;
    impl->y = y;
    impl->fontSize = fontSize;
    impl->textColor = color;
    [self setNeedsDisplay:YES];
}

- (void)drawRect:(NSRect)dirtyRect {
    [super drawRect:dirtyRect];

    NSRect bounds = [self bounds];
    CGContextRef context = [[NSGraphicsContext currentContext] CGContext];
    if (impl->rgbData.empty() || impl->frameWidth == 0 || impl->frameHeight == 0) {
        return;
    }

    CGFloat scaleX = bounds.size.width / impl->frameWidth;
    CGFloat scaleY = bounds.size.height / impl->frameHeight;

    CGFloat scale = MIN(scaleX, scaleY);

    CGFloat offsetX = (bounds.size.width - (impl->frameWidth * scale)) / 2.0;
    CGFloat offsetY = (bounds.size.height - (impl->frameHeight * scale)) / 2.0;

    CGContextSaveGState(context); 

    CGContextTranslateCTM(context, offsetX, offsetY);
    CGContextScaleCTM(context, scale, scale);

    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGDataProviderRef provider = CGDataProviderCreateWithData(NULL, impl->rgbData.data(), impl->rgbData.size(), NULL);
    CGImageRef image = CGImageCreate(impl->frameWidth, impl->frameHeight, 8, 24, impl->frameWidth * 3, colorSpace, kCGBitmapByteOrderDefault | kCGImageAlphaNone, provider, NULL, false, kCGRenderingIntentDefault);

    CGRect rect = CGRectMake(0, 0, impl->frameWidth, impl->frameHeight);
    CGContextDrawImage(context, rect, image);

    CGImageRelease(image);
    CGDataProviderRelease(provider);
    CGColorSpaceRelease(colorSpace);

    if (!impl->contourPoints.empty()) {
        CGContextSaveGState(context); 

        CGContextSetLineWidth(context, 3.0 / scale); 
        CGContextSetStrokeColorWithColor(context, [[NSColor yellowColor] CGColor]);

        auto firstPoint = impl->contourPoints[0];
        CGContextMoveToPoint(context, firstPoint.first, impl->frameHeight - firstPoint.second);

        for (size_t i = 1; i < impl->contourPoints.size(); ++i) {
            auto point = impl->contourPoints[i];
            CGContextAddLineToPoint(context, point.first, impl->frameHeight - point.second);
        }

        CGContextClosePath(context);
        CGContextStrokePath(context);

        CGContextRestoreGState(context); 

        impl->contourPoints.clear(); 
    }

    CGContextRestoreGState(context); 

    if (!impl->displayText.empty()) {
        CGContextSaveGState(context); 
    
        CGFloat scaledX = impl->x * scale + offsetX;
        CGFloat scaledY = impl->y * scale + offsetY;

        NSColor *color = [NSColor colorWithRed:impl->textColor.r / 255.0 green:impl->textColor.g / 255.0 blue:impl->textColor.b / 255.0 alpha:1.0];

        NSDictionary *attributes = @{
            NSFontAttributeName : [NSFont systemFontOfSize:impl->fontSize * scale],
            NSForegroundColorAttributeName : color
        };

        NSPoint point = NSMakePoint(scaledX, bounds.size.height - scaledY - (impl->fontSize * scale));
        NSString *nsText = [NSString stringWithUTF8String:impl->displayText.c_str()];
        [nsText drawAtPoint:point withAttributes:attributes];

        CGContextRestoreGState(context);

        impl->displayText.clear(); 
    }
}

@end

Explanation

  • CameraContentViewImpl holds all the data required for rendering (frame data, text, etc.).
  • CameraContentView is an NSView subclass that draws the camera frame, contours, and text.
  • Calling setNeedsDisplay: triggers a redraw, which in turn calls drawRect.
  • In drawRect:, the scaling factors and offsets are calculated to maintain aspect ratio and center the image.

Handle Window Close Events

Create a custom NSWindowDelegate to handle window events:

@interface CameraWindowDelegate : NSObject <NSWindowDelegate>
@end

@implementation CameraWindowDelegate
- (BOOL)windowShouldClose:(id)sender {
    [NSApp terminate:nil];
    return YES;
}
@end

Initialize the NSWindow and Content View

Initialize the window and content view, and store them in nsWindow and contentView:

bool CameraWindow::Create() {
    @autoreleasepool {
        if (NSApp == nil) {
            [NSApplication sharedApplication];
            [NSApp setActivationPolicy:NSApplicationActivationPolicyRegular];
            [NSApp finishLaunching];
        }

        NSRect contentRect = NSMakeRect(100, 100, width, height);
        NSUInteger styleMask = NSWindowStyleMaskTitled | NSWindowStyleMaskClosable | NSWindowStyleMaskResizable;
        NSWindow *window = [[NSWindow alloc] initWithContentRect:contentRect
                                                       styleMask:styleMask
                                                         backing:NSBackingStoreBuffered
                                                           defer:NO];
        if (!window) {
            return false;
        }

        [window setTitle:[NSString stringWithUTF8String:title.c_str()]];
        [window makeKeyAndOrderFront:nil];

        CameraContentView *cv = [[CameraContentView alloc] initWithFrame:contentRect];
        [window setContentView:cv];
        contentView = cv;

        CameraWindowDelegate *delegate = [[CameraWindowDelegate alloc] init];
        [window setDelegate:delegate];

        nsWindow = (void *)window;
        return true;
    }
}

Display the Window

Bring the application to the front and make it active:

void CameraWindow::Show() {
    @autoreleasepool {
        [NSApp activateIgnoringOtherApps:YES];
    }
}

Listen for Keyboard Input

Capture keyboard input events:

bool CameraWindow::WaitKey(char key)
{
    @autoreleasepool {
        NSEvent *event = [NSApp nextEventMatchingMask:NSEventMaskAny
                                           untilDate:[NSDate distantPast]
                                              inMode:NSDefaultRunLoopMode
                                             dequeue:YES];
        if (event) {
            [NSApp sendEvent:event];

            if (event.type == NSEventTypeKeyDown) {
                NSString *characters = [event charactersIgnoringModifiers];
                if ([characters length] > 0) {
                    char pressedKey = [characters characterAtIndex:0];
                    if (key == '\0' || pressedKey == key || pressedKey == std::toupper(key)) {
                        return false;  
                    }
                }
            }
        }
        return true;
    }
}

Draw the Camera Frame, Contours, and Text

Update the camera frame, contours, and text as follows:

void CameraWindow::ShowFrame(const unsigned char *rgbData, int frameWidth, int frameHeight) {
    if (contentView) {
        CameraContentView *cv = (__bridge CameraContentView *)contentView;
        [cv updateFrame:rgbData width:frameWidth height:frameHeight];
    }
}

void CameraWindow::DrawContour(const std::vector<std::pair<int, int>> &points) {
    if (contentView) {
        CameraContentView *cv = (__bridge CameraContentView *)contentView;
        [cv updateContour:points];
    }
}

void CameraWindow::DrawText(const std::string &text, int x, int y, int fontSize, const CameraWindow::Color &color) {
    if (contentView) {
        CameraContentView *cv = (__bridge CameraContentView *)contentView;
        [cv updateText:text x:x y:y fontSize:fontSize color:color];
    }
}

Configure CMakeLists.txt for macOS Frameworks

To build the library on macOS, update the CMakeLists.txt:

...

if (WIN32)
    ...
elseif (UNIX AND NOT APPLE)
    ...
elseif (APPLE)
    set(LIBRARY_SOURCES
        src/CameraMacOS.mm
        src/CameraPreviewMacOS.mm
    )
    set_source_files_properties(src/CameraMacOS.mm src/CameraPreviewMacOS.mm PROPERTIES COMPILE_FLAGS "-x objective-c++")
    set_source_files_properties(src/main.cpp PROPERTIES COMPILE_FLAGS "-x objective-c++")
endif()
...

if (UNIX AND NOT APPLE)
    ...
elseif (APPLE)
    find_library(COCOA_LIBRARY Cocoa REQUIRED)
    find_library(AVFOUNDATION_LIBRARY AVFoundation REQUIRED)
    find_library(COREMEDIA_LIBRARY CoreMedia REQUIRED)
    find_library(COREVIDEO_LIBRARY CoreVideo REQUIRED)
    find_library(OBJC_LIBRARY objc REQUIRED)  # Add the Objective-C runtime library

    target_link_libraries(litecam PRIVATE 
        ${COCOA_LIBRARY} 
        ${AVFOUNDATION_LIBRARY} 
        ${COREMEDIA_LIBRARY} 
        ${COREVIDEO_LIBRARY} 
        ${OBJC_LIBRARY}  # Link the Objective-C runtime
    )
elseif (WIN32)
    ...
endif()
...

if (APPLE)
    target_link_libraries(camera_capture PRIVATE 
        ${COCOA_LIBRARY} 
        ${AVFOUNDATION_LIBRARY} 
        ${COREMEDIA_LIBRARY} 
        ${COREVIDEO_LIBRARY} 
        ${OBJC_LIBRARY}  # Link the Objective-C runtime
    )
endif()

Explanation

  • -x objective-c++ forces the compiler to treat the listed source files as Objective-C++, so Objective-C message sends and C++ code can coexist in one translation unit.
  • On macOS, link against the Cocoa, AVFoundation, CoreMedia, and CoreVideo frameworks, plus the Objective-C runtime (objc).
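Pulled together, a minimal standalone CMakeLists.txt for a single Objective-C++ target might look like this (target and file names are illustrative; adapt them to your project layout):

```cmake
cmake_minimum_required(VERSION 3.15)
project(litecam_demo CXX)

add_executable(litecam_demo src/CameraMacOS.mm src/main.cpp)

if (APPLE)
    # Compile every listed source as Objective-C++ so AVFoundation
    # message sends and C++ code can coexist in one translation unit.
    set_source_files_properties(src/CameraMacOS.mm src/main.cpp
        PROPERTIES COMPILE_FLAGS "-x objective-c++")

    find_library(COCOA_LIBRARY Cocoa REQUIRED)
    find_library(AVFOUNDATION_LIBRARY AVFoundation REQUIRED)
    find_library(COREMEDIA_LIBRARY CoreMedia REQUIRED)
    find_library(COREVIDEO_LIBRARY CoreVideo REQUIRED)
    find_library(OBJC_LIBRARY objc REQUIRED)

    target_link_libraries(litecam_demo PRIVATE
        ${COCOA_LIBRARY} ${AVFOUNDATION_LIBRARY}
        ${COREMEDIA_LIBRARY} ${COREVIDEO_LIBRARY} ${OBJC_LIBRARY})
endif()
```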

Build and Run the macOS Barcode Scanner

The barcode scanning logic itself is platform-independent, so no code changes are needed; only the build configuration differs. Follow these steps:

  1. Update the CMakeLists.txt file to include the macOS-specific configuration.

     ...
        
     if(WIN32)
        
         ...
        
     elseif(APPLE)
         set(CMAKE_CXX_FLAGS "-std=c++11 -O3 -Wl,-rpath,@executable_path")
         set(CMAKE_INSTALL_RPATH "@executable_path")
        
         link_directories(
             ${CMAKE_CURRENT_SOURCE_DIR}/../../dist/lib/macos
             ${CMAKE_CURRENT_SOURCE_DIR}/../../../examples/10.x/sdk/platforms/macos
         )
        
         set(DBR_LIBS
             "DynamsoftCore"
             "DynamsoftLicense"
             "DynamsoftCaptureVisionRouter"
             "DynamsoftUtility"
             "pthread"
         )
     elseif(UNIX)
         ...
     endif()
        
     ...
        
     if(WIN32)
         ...
     elseif(APPLE)
         add_custom_command(TARGET BarcodeScanner POST_BUILD
             COMMAND ${CMAKE_COMMAND} -E copy_directory
             ${CMAKE_CURRENT_SOURCE_DIR}/../../../examples/10.x/sdk/platforms/macos
             $<TARGET_FILE_DIR:BarcodeScanner>
         )
     elseif(UNIX)
         ...
     endif()
        
        
    
  2. Build the application using CMake.

     mkdir build
     cd build
     cmake ..
     cmake --build .
    

     (Screenshot: the macOS barcode scanner in action)

Common Issues and Edge Cases

  • Camera permission denied: macOS requires a camera usage description in your Info.plist. If your app does not include NSCameraUsageDescription, the system will silently deny camera access. Add <key>NSCameraUsageDescription</key><string>This app needs camera access to scan barcodes.</string> to your Info.plist.
  • Black frames on Apple Silicon: Some M-series Macs deliver the first few frames as blank buffers. Skip frames where CVPixelBufferGetBaseAddress returns all zeros, or add a short warm-up delay after calling [session startRunning].
  • Linking errors for Objective-C runtime: If you get Undefined symbols for architecture errors referencing objc_msgSend, ensure you link the objc runtime library (find_library(OBJC_LIBRARY objc REQUIRED)) and pass -x objective-c++ to the compiler for .mm files.

FAQ

How do I access the macOS camera from C++ without OpenCV?

Use AVFoundation via Objective-C++ (.mm files). Create an AVCaptureSession, attach an AVCaptureDeviceInput for the camera, and route output through AVCaptureVideoDataOutput to receive raw BGRA frames in a C++ callback.

What pixel format does AVFoundation deliver, and how do I convert it for barcode decoding?

AVFoundation delivers frames in BGRA by default when you set kCVPixelFormatType_32BGRA. Convert to RGB by swapping the blue and red channels (offset +2 → R, offset +1 → G, offset +0 → B) in a per-pixel loop.

Why does my CMake build fail with “Undefined symbols” on macOS?

You need to link the Cocoa, AVFoundation, CoreMedia, CoreVideo, and objc frameworks. Also ensure .mm source files have the -x objective-c++ compile flag set via set_source_files_properties in CMakeLists.txt.

Source Code

https://github.com/yushulx/cmake-cpp-barcode-qrcode-mrz/tree/main/litecam