Small Object Photogrammetry Through Focus Stacking
Woodshedding in a basement
Back in March, during our last on-site work week before the COVID-19 shutdown at UConn, I shot and took with me as much useful raw image data as possible from our lab's automated capture system for later photogrammetric post-processing. I was also able to check out a small assortment of gear that I thought I would combine with my own resources at home to create a purpose-built photogrammetry rig in the basement.
The resulting kit features an inexpensive popup light tent, an old lazy Susan turntable, and a Cognisys Stackshot 3X macro rail package. I've attached the macro rail to a Manfrotto 405 3-Way, Geared Pan-and-Tilt Head, which is screwed onto a spare tripod that I normally keep behind the seat of my truck. A Canon 5D III, my current backup camera, sits atop the Manfrotto head and is mated to a Zeiss Milvus 50mm f/1.4 ZE lens.
With this setup, I have been further customizing a photogrammetry technique for small archaeological lithics that I initially encountered in the 2016 paper, A Simple Photogrammetry Rig for the Reliable Creation of 3D Artifact Models in the Field: Lithic Examples from the Early Upper Paleolithic Sequence of Les Cottés (France). The first author, Samantha Porter, has also produced an excellent three-part video series that goes into extensive, articulate detail on the specialized post-processing required for the raw data acquired from such a system.
A problem explained
Generally speaking, depth of field becomes shallower as one shortens the camera-to-subject distance. Most camera/lens combinations also become diffraction limited when stopped down beyond a certain aperture, so closing the aperture further eventually costs sharpness rather than adding it. This combination of factors works against the straightforward capture of sharp, high-resolution images of small objects that could potentially serve to create richer photogrammetric geometry and texture.
Coincidentally, pieces like lithics often possess distinctive surface geometry and visible detail that researchers highly value. But these artifacts can be characteristically small as well. With the system that I was spinning up at home, one thing that I wanted to explore was how to gather more useful object detail during initial image acquisition than is normally possible with standard single-shot camera capture. In turn, I wished to see how incorporating additional photographic techniques could help refine an established photogrammetric pipeline.
Devising a solution
In the 2D imaging world, digital macro photographers have traditionally employed a method known as focus stacking or z-stacking to effectively achieve greater depth of field. This is accomplished first through a series of photos taken of an object at different focus positions, covering the range between the nearest and farthest points of the object in relation to camera position. From the resulting image set or focus bracket, the sharpest regions of each photo's overlapping depth of field are then computationally merged or focus stacked into a single, composite image with a much greater depth of field than any of its component frames.
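As a toy illustration of the merging step only, the Python sketch below builds a composite by taking each pixel from whichever frame is locally sharpest. The sharpness measure (an absolute 4-neighbour Laplacian), the function names, and the tiny synthetic frames are all assumptions made for this example. It is not a description of how Photoshop or Helicon Focus work internally, and it skips the frame alignment and blending that real stacking software performs.

```python
def sharpness(img, y, x):
    """Absolute 4-neighbour Laplacian at (y, x); borders are clamped."""
    h, w = len(img), len(img[0])
    up = img[max(y - 1, 0)][x]
    down = img[min(y + 1, h - 1)][x]
    left = img[y][max(x - 1, 0)]
    right = img[y][min(x + 1, w - 1)]
    return abs(4 * img[y][x] - up - down - left - right)


def focus_stack(frames):
    """Per pixel, copy the value from the frame with the highest sharpness."""
    h, w = len(frames[0]), len(frames[0][0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = max(frames, key=lambda f: sharpness(f, y, x))
            out[y][x] = best[y][x]
    return out


# Hypothetical 4x4 "scene": a checkerboard standing in for fine surface detail.
scene = [[10 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]
FLAT = 5  # an out-of-focus region is modelled as a featureless mid-grey

# One frame is sharp on the left half only, the other on the right half only.
near_frame = [[scene[y][x] if x < 2 else FLAT for x in range(4)] for y in range(4)]
far_frame = [[scene[y][x] if x >= 2 else FLAT for x in range(4)] for y in range(4)]

composite = focus_stack([near_frame, far_frame])
print(composite == scene)
```

In this contrived case the composite recovers the full checkerboard, because each half of the scene is sharp in exactly one frame. Real brackets have gradual focus falloff and slight scale changes between frames, which is why dedicated software is used for the actual work.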
A common method of focus bracketing involves moving the entire camera/lens combination from one focal point to the next through the use of a macro focus rail. With the Stackshot 3X system, not only can this movement be very precisely calculated and executed, but a given photo sequence can also be programmatically shot with a single touchscreen gesture.
From the controller's Auto-Distance mode, the user can enter both start and stop positions and can also specify the distance of travel for each step in a given series. In my case, to best estimate an appropriate travel distance, I first calculated the depth of field for my specific camera, focal length, f-stop, and subject distance using the formula DoF ≈ 2u²Nc/f², where u is the subject distance, N is the f-number, c is the circle of confusion, and f is the focal length. Actually, I didn't do that manually. Instead, I used DOFMaster's online calculator and came up with a value of 0.96 cm for my particular camera and lens, set up to shoot at f/11 from a distance of 21.5 cm.
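For readers who want to check the numbers themselves, here is a minimal Python sketch of the depth-of-field calculation. It computes both the quick approximation quoted above and the standard hyperfocal-based near and far limits. The 0.03 mm circle of confusion (a common figure for full-frame sensors) and the use of the exact f/11 stop value of 2^3.5 are assumptions on my part; the inputs the online calculator actually uses may differ.

```python
def dof_approx_mm(u, n, c, f):
    """Quick approximation: DoF ~ 2 * u^2 * N * c / f^2 (all lengths in mm)."""
    return 2 * u ** 2 * n * c / f ** 2


def dof_exact_mm(u, n, c, f):
    """Depth of field from hyperfocal-based near/far limits (lengths in mm).

    Only valid while the subject distance u is shorter than the
    hyperfocal distance; beyond that the far limit is infinite.
    """
    hyperfocal = f ** 2 / (n * c) + f
    near = u * (hyperfocal - f) / (hyperfocal + u - 2 * f)
    far = u * (hyperfocal - f) / (hyperfocal - u)
    return far - near


SUBJECT_DISTANCE_MM = 215.0  # 21.5 cm, from the text
FOCAL_LENGTH_MM = 50.0       # Zeiss Milvus 50mm
COC_MM = 0.03                # assumed circle of confusion for full frame
F_NOMINAL = 11.0             # the marked f-stop
F_EXACT = 2 ** 3.5           # assumed exact value of the f/11 stop (~11.31)

approx = dof_approx_mm(SUBJECT_DISTANCE_MM, F_NOMINAL, COC_MM, FOCAL_LENGTH_MM)
exact = dof_exact_mm(SUBJECT_DISTANCE_MM, F_EXACT, COC_MM, FOCAL_LENGTH_MM)

print(f"approximation: {approx:.1f} mm")
print(f"near/far limits: {exact:.1f} mm")
```

Under these assumptions the near/far calculation comes out at roughly 9.6 mm, in line with the 0.96 cm figure, while the quick approximation gives about 12 mm. The gap reflects how close the subject is to the lens: the approximation becomes less accurate when the subject distance is only a few focal lengths.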
An in-focus overlap of roughly 50% between adjacent images is a good initial guideline for creating a successful focus stack in post-processing. Accordingly, the calculated depth of field was divided by 2 to estimate 50% coverage (0.48 cm). This value was then converted into millimeters (4.8 mm) and entered into the Stackshot 3X controller's distance (of rail travel) setting.
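The step arithmetic can be sketched in a few lines of Python. The function names and the 14 mm travel distance are hypothetical, chosen only for illustration, and the shot count is a simple "fence post" estimate that may not match how the Stackshot controller rounds its own step count.

```python
import math


def rail_step_mm(dof_cm, overlap=0.5):
    """Rail travel per shot, in mm, for a given fractional in-focus overlap."""
    return dof_cm * 10.0 * (1.0 - overlap)


def shots_needed(travel_mm, step_mm):
    """Shots needed to cover the travel: one at the start plus one per step.

    The tiny tolerance keeps floating-point noise from adding a spurious shot.
    """
    return math.ceil(travel_mm / step_mm - 1e-9) + 1


step = rail_step_mm(0.96)          # depth of field from the calculator, in cm
shots = shots_needed(14.0, step)   # 14 mm start-to-stop travel is hypothetical

print(f"step: {step:.1f} mm, shots per bracket: {shots}")
```

A larger overlap fraction shortens each step and raises the shot count, which is the main trade-off to tune if a stack shows soft bands between adjacent frames.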
Next, the beginning and end positions of the series were assigned by using the controller to move the Canon 5D III along the rail while monitoring the camera's live view for correct focus through a tethered laptop. Once these positions were set, the controller automatically calculated and displayed the number of steps required to meet all entered parameters. From that point on, I could shoot each successive focus bracket by simply hitting the touchscreen's start button. Following each completed series, I would then manually rotate the object 10 degrees on the lazy Susan and commence yet another bracket from the controller.
After testing both Photoshop and Helicon Focus, I chose Photoshop to focus stack each focus-bracketed series. Two attributes were key in Photoshop's favor: its consistent alignment and blending, and its ability to work in conjunction with Lightroom, where the camera raw files were initially organized by bracket and batch edited. The resulting composite images were saved out as TIFF files. These were then imported into Agisoft Metashape Pro photogrammetry software, where 3D data was created from the focus-stacked 2D TIFFs.
To date, test objects have included those measuring as little as 2.71cm across their longest dimension. See the above illustrations for one such example. For lithics this small and in the shape of thin projectile points, 4 shots per bracket were normally captured. This resulted in a total of 144 shots per complete 360-degree spin, derived from 36 incremental 10-degree rotations of the lazy Susan. Since a block of kneadable eraser was used to support the lithic first by its base, then by its tip, a separate 360-degree rotation was conducted for each object orientation to guarantee full look angle coverage and to avoid occlusions caused by the support. This, combined with the previously noted requirements for focus stacking, resulted in 288 total raw images being shot of the piece.
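The shot counts above follow from simple multiplication, sketched here in Python so the plan can be re-run for other rotation steps or bracket sizes. The function and key names are hypothetical.

```python
def capture_plan(rotation_step_deg, shots_per_bracket, orientations):
    """Tally raw frames and stacked composites for a turntable capture."""
    positions_per_spin = 360 // rotation_step_deg
    raw_per_spin = positions_per_spin * shots_per_bracket
    return {
        "positions_per_spin": positions_per_spin,
        "raw_per_spin": raw_per_spin,
        "raw_total": raw_per_spin * orientations,
        # Each bracket is merged into one composite image.
        "composites_total": positions_per_spin * orientations,
    }


# Values from the text: 10-degree steps, 4 shots per bracket,
# and two object orientations (supported by the base, then by the tip).
plan = capture_plan(rotation_step_deg=10, shots_per_bracket=4, orientations=2)
print(plan)
```

With these inputs the plan gives 36 positions and 144 raw frames per spin, 288 raw frames overall, and 72 composites to hand to the photogrammetry software.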
Rendered results: non-stacked vs. focus stacked
For comparison, here is the same lithic as previously rendered in 3D from 144 total photographs taken from two look angles per object orientation. This example was shot at a greater camera-to-subject distance and without focus stacking (as displayed in Metashape Pro's model view):
And here is the same object rendered in 3D from 288 raw images taken from one look angle per object orientation. In this case, focus-bracketed images were focus-stacked by Photoshop into 72 composite TIFFs:
The following zoomed-in views of the above tandem renderings include a measured 1mm feature that offers a better sense of the object's small scale and comparative perceivable detail between the non-focus stacked and focus stacked versions:
Decimated and compressed versions of these two examples may also be viewed as interactive 3D models on Sketchfab. Feel free to zoom in and pan around that same 1mm detail from the preceding illustrations to better compare the two 3D renderings:
This combination of focus stacking and small object photogrammetry has similarly been employed by Santella and Milner (2017) using examples from paleontology, and by Olkowicz et al. (2019) in their recent study of rock fracture morphology. But it can be applied to other item types as well... for instance, coins and tokens.
This token for Rhode Island's Newport Bridge brought back childhood memories of Del's, the absolutely terrifying Old Jamestown Bridge, and summer day trips from my parents' Central Massachusetts home to Beaver Tail Lighthouse. At 28mm in diameter, the brass piece is similar in size to the lithic examined earlier and likewise demonstrates the benefits of the same choreographed photography and software techniques described herein.
Final thoughts
As we look toward the future, it will be interesting to see if Canon and other manufacturers eventually include in-camera focus bracketing and stacking in some of their models, much as Olympus and Panasonic have already done. Though possibly introducing an undesirable workflow black box, this feature could potentially eliminate the complexities of focus rails, additional file management, and the separate program hand-offs outlined here for small objects. If done with accuracy, precision, and a level of transparency, such in-camera processing holds the promise of significantly streamlining the creation of the high-quality source images that are so crucial to building exceptional 3D geometry and visible detail.