Developing an Image Synthesizer
(October 6, 2020)

Hi, I'm going to talk about the development of Image Synthesizer. It is an in-browser app written in a mix of JavaScript and GLSL (WebGL), designed to generate new image patterns from existing ones. You can see an example of what it does in the image to the right.

If you would like to see the finished code, you can download it here.

It is nowhere close to perfect, and that's specifically what I want to blog about today. This is something I've been attempting to do for years, and it originally started as text generation. I was attempting a crude implementation of Markov chains in JavaScript; you can see the bad text generation here. After seeing the slurred, drunken-sounding text it generated, I thought "cool, what if I could apply this to images?".

Text Generator
The text generator records how often each symbol (character) follows the previous one, and generates text by sampling those counts as probabilities. To give an example of how the text generator works, take the string 'wooow'. Count which character follows each character: after 'w' comes 'o', and so on. You will record:
'w' leads to 'o' 1 time
'o' leads to 'o' 2 times and 'w' 1 time
That gives us our probability values, and now we can randomly sample them to generate text. Here is some JavaScript you can run in the console to demonstrate the random sampling.
//our probabilities (the transition counts recorded from 'wooow')
var probabilities = {"w":{"o":1}, "o":{"o":2, "w":1}};

//start with 'w'
var gen = "w";

//generate 9 more random characters afterwards
var last = gen;
for (var i = 0; i < 9; i++) {
	//generate next character based off probabilities of last
	var prob = probabilities[last],
		sum = 0;
	//sum the counts so the random sample can be scaled to their total
	for (var c in prob) sum += prob[c];
	//generate random value scaled by sum, this is our sample index
	var rind = Math.random()*sum;
	//search through probabilities to find the area matching our random index
	var ind = 0;
	for (var c in prob) {
		var v = prob[c];
		ind += v;
		if (ind >= rind) {
			//this gives us our next character c
			gen += c;
			last = c;
			//stop searching once a character has been picked
			break;
		}
	}
}

//output generated in console
console.log(gen);

Executing the code should generate something like 'wooowooowo'. It's a ghost.
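
The snippet above starts from a hand-written probability table. The counting step described earlier could be sketched roughly like this; countTransitions is a hypothetical helper name used for illustration, not something taken from the finished code.

```javascript
// Hypothetical helper: count how often each character is followed by another.
function countTransitions(text) {
	const counts = {};
	for (let i = 0; i + 1 < text.length; i++) {
		const current = text[i],
			next = text[i + 1];
		// create the inner table the first time a character is seen
		if (!counts[current]) counts[current] = {};
		counts[current][next] = (counts[current][next] || 0) + 1;
	}
	return counts;
}

console.log(JSON.stringify(countTransitions("wooow")));
// prints {"w":{"o":1},"o":{"o":2,"w":1}}
```

The result has the same shape as the probabilities object used above, so it can be dropped straight into the sampling loop.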

Images are not Text
Starting from the text generation code, checking whether two pieces of text are equal is simple: just check equality. Doing the same for pixels isn't so simple, because there are many different ways of comparing pixels: Euclidean distance, Manhattan distance, max difference, or converting first to another color format that is closer to human perception, like YUV. I ended up just using plain Euclidean distance.
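
For comparison, here is a rough sketch of those options side by side. The function names are made up for this example, pixels are assumed to be [r, g, b] arrays with channels in 0-255, and the YUV conversion uses the common BT.601 coefficients, which is an assumption about which YUV variant would be used.

```javascript
// Straight-line distance in RGB space.
function euclidean(a, b) {
	return Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2]);
}

// Sum of the absolute channel differences.
function manhattan(a, b) {
	return Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]) + Math.abs(a[2] - b[2]);
}

// Largest single channel difference.
function maxDifference(a, b) {
	return Math.max(Math.abs(a[0] - b[0]), Math.abs(a[1] - b[1]), Math.abs(a[2] - b[2]));
}

// RGB to YUV using BT.601 coefficients (assumed variant).
function rgbToYuv(p) {
	const r = p[0], g = p[1], b = p[2];
	return [
		0.299 * r + 0.587 * g + 0.114 * b,
		-0.14713 * r - 0.28886 * g + 0.436 * b,
		0.615 * r - 0.51499 * g - 0.10001 * b
	];
}

// Euclidean distance measured after converting both pixels to YUV.
function yuvDistance(a, b) {
	return euclidean(rgbToYuv(a), rgbToYuv(b));
}

const black = [0, 0, 0], other = [3, 4, 0];
console.log(euclidean(black, other), manhattan(black, other), maxDifference(black, other));
```

Any of these could be swapped into the comparison step; they differ mainly in how strongly one large channel difference counts against several small ones.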

That was the small issue. The bigger one is direction: in English text the direction is always left to right, but a pixel in a 2D image has 8 neighbouring pixels and so 8 directions. Since there are so many combinations of neighbouring patterns in even a small image, the approach of counting up probabilities becomes too memory expensive, at least for my liking. But now I had a rough idea of what I needed: a function that generates probabilities directly from the source pattern image.
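
To get a feel for the scale, here is a back-of-the-envelope sketch. It assumes every neighbour can take any color independently, so it is only an upper bound on the number of table keys (a real table would store just the patterns that occur in the source), and neighbourhoodCount is a name made up for this example.

```javascript
// Upper bound on distinct neighbourhoods: levels^(channels * neighbours).
// BigInt is used because the results quickly overflow normal numbers.
function neighbourhoodCount(levelsPerChannel, channels, neighbours) {
	return BigInt(levelsPerChannel) ** BigInt(channels * neighbours);
}

// black and white image, 8 neighbours
console.log(neighbourhoodCount(2, 1, 8).toString()); // prints 256
// RGB reduced to 16 levels per channel, 8 neighbours
console.log(neighbourhoodCount(16, 3, 8).toString());
// full 8-bit RGB, 8 neighbours
console.log(neighbourhoodCount(256, 3, 8).toString());
```

Even the reduced 16-level case works out to 2^96 possible keys for a single 3x3 neighbourhood, and the search described next looks at areas larger than that.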

Not Quite Probabilities
Our function takes a source pattern image and a partly synthesized image as input. It then searches through all the pixels in the source, finding the largest chunks that match the partly synthesized input. The positions of these largest, most similar chunks are added to an array called foundIds, which the function returns. So instead of an array of probabilities it's just a plain array of sample points. Thankfully this ends up working the same way, because matching areas will appear multiple times in the array of sample points, giving the same effect as a higher probability! Anyway, the best explanation is the JavaScript code itself:
var patternSize = 5,//the max extent of the pattern that is searched
quantization = 5;//this is the maximum difference 2 pixels can be to be considered equal

function search(sourceData,sourceWidth,sourceHeight,
synthX,synthY,synthData,synthWidth,synthHeight) {
	var maxFound = 0,
		foundIds = [];
	//search through source image
	for (var searchY = 0; searchY < sourceHeight; searchY++) {
		for (var searchX = 0; searchX < sourceWidth; searchX++) {
			//search pixel and expand outward counting matching
			var sz = 0,
				totalFound = 0;
			//sz runs from 0 up to patternSize, so patternSize=1 covers up to a 3x3 area
			while (sz <= patternSize) {
				//expand search area starting with 1x1, then cover 3x3, then 5x5, etc around search x,y until we find no matches at all
				var found = 0,
					szw = Math.max(1,sz+sz);
				for (var ly = -sz; ly <= sz; ly++) {
					//only walk the outer ring: every pixel on the top and bottom rows, just the 2 edge pixels on the rows between
					var lstride = (ly===-sz||ly===sz)?1:szw;
					for (var lx = -sz; lx <= sz; lx += lstride) {
						//source pixel, skip if out of bounds
						var sourceX = searchX+lx, sourceY = searchY+ly;
						if (sourceX < 0 || sourceX >= sourceWidth ||
							sourceY < 0 || sourceY >= sourceHeight) continue;
						//synthesized pixel, skip if out of bounds
						var lsynthX = synthX+lx, lsynthY = synthY+ly;
						if (lsynthX < 0 || lsynthX >= synthWidth ||
							lsynthY < 0 || lsynthY >= synthHeight) continue;
						//calculate pixel data array indices
						var sourceIndex = (sourceX+sourceY*sourceWidth)*4,
							synthIndex = (lsynthX+lsynthY*synthWidth)*4;
						//check if 2 pixels equal, add to found count if equal
						var redDiff = sourceData[sourceIndex]-synthData[synthIndex],
							greenDiff = sourceData[sourceIndex+1]-synthData[synthIndex+1],
							blueDiff = sourceData[sourceIndex+2]-synthData[synthIndex+2];
						//euclidean distance
						if (Math.sqrt(redDiff*redDiff+greenDiff*greenDiff+blueDiff*blueDiff) < quantization) found++;
					}
				}
				//exit if none found
				if (!found) break;
				totalFound += found;
				//grow the search area for the next pass
				sz++;
			}
			//add pixel chunk to array if most similar
			if (totalFound) {
				if (totalFound > maxFound) {
					//new best match, discard the weaker matches found so far
					maxFound = totalFound;
					foundIds.length = 0;
				}
				if (totalFound === maxFound) foundIds.push(searchX+searchY*sourceWidth);//write out pixel index of the search position for sampling
			}
		}
	}
	//return found
	return foundIds;
}
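
To see why a plain array of sample points can stand in for a probability table, here is a tiny sketch. It uses characters instead of pixel indices to keep it short, and sampleUniform is a hypothetical name; the random function is passed in only so the behaviour is easy to check.

```javascript
// Pick one entry uniformly at random from an array of sample points.
function sampleUniform(points, random) {
	return points[Math.floor(random() * points.length)];
}

// 'o' is listed twice and 'w' once, like counts of o:2, w:1 in a table.
const samplePoints = ["o", "o", "w"];
const tally = { o: 0, w: 0 };
for (let i = 0; i < 30000; i++) tally[sampleUniform(samplePoints, Math.random)]++;

// 'o' should come up roughly twice as often as 'w'
console.log(tally);
```

In the image case the entries are source pixel indices, and several different indices can point at pixels of the same color, which is what gives that color a higher chance of being picked.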

One of the key parts is in the middle of the search function's loops, where it compares the distance between two pixels against the quantization amount.
The reason for this is that with 8 bits per channel there are 256 possible values for each of red, green and blue, so the odds of two pixels being exactly equal are very low. The solution is quantization: reducing the number of values a pixel can take from 0-255 to something manageable like 0-15. The search function gets a similar effect by treating any two pixels closer than the quantization distance as equal.
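
For contrast, this is roughly what literal quantization to 0-15 could look like. It is a sketch of the general idea rather than what the search function does, and the helper names are invented for this example.

```javascript
// Reduce an 8-bit channel (0-255) to 4 bits (0-15) by dropping the low 4 bits.
function quantizeChannel(value) {
	return value >> 4;
}

// Two [r, g, b] pixels count as equal if all quantized channels match.
function quantizedEqual(a, b) {
	return quantizeChannel(a[0]) === quantizeChannel(b[0]) &&
		quantizeChannel(a[1]) === quantizeChannel(b[1]) &&
		quantizeChannel(a[2]) === quantizeChannel(b[2]);
}

console.log(quantizedEqual([200, 100, 50], [205, 98, 55])); // prints true
console.log(quantizedEqual([15, 0, 0], [16, 0, 0])); // prints false
```

The second call shows a drawback of fixed buckets: two colors only 1 apart can land on opposite sides of a bucket boundary. A distance threshold does not have that boundary problem.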

My initial plan was to start the synthesized image as noise, seen below, then run the search function across each pixel one by one in scanlines, sampling the matches that end up being similar to our noise. Time to try it! Feeding in the image below as the source pattern, you can see the resulting synthesis on the bottom right.
Initial Noise
Source Pattern
Synthesis Result
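
For reference, a starting noise buffer in the same RGBA layout the search function reads could be generated along these lines. This is a sketch under the assumption that each color channel is an independent random value and alpha is fully opaque; makeNoise is a hypothetical name.

```javascript
// Fill a width x height RGBA buffer with random colors.
// The random function is a parameter (defaulting to Math.random) so it can be swapped out.
function makeNoise(width, height, random = Math.random) {
	const data = new Uint8ClampedArray(width * height * 4);
	for (let i = 0; i < data.length; i += 4) {
		data[i] = Math.floor(random() * 256); // red
		data[i + 1] = Math.floor(random() * 256); // green
		data[i + 2] = Math.floor(random() * 256); // blue
		data[i + 3] = 255; // alpha, fully opaque
	}
	return data;
}

const noise = makeNoise(4, 4);
console.log(noise.length); // prints 64
```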

As you can probably tell, it did not work effectively at all and just changed the color of the noise. At this point I also realized I needed a much simpler source image pattern as a baseline, because with so many details it's hard to even tell which details are coming through in the noise. This black and white tiling pattern works especially well because it's already quantized for us.

Now it's time to ditch the noise, so let's try synthesizing from a blank canvas. Instead of searching the whole synthesized image, we need to limit the search to the rows of pixels above synthY and the current row up to synthX. This is so it won't compare pixels that aren't filled in yet. Change the synthesized out of bounds check as seen below.
//synthesized pixel, skip if out of bounds
var lsynthX = synthX+lx, lsynthY = synthY+ly;
if (lsynthX < 0 || (lsynthY === synthY && lsynthX > synthX) || lsynthX >= synthWidth ||
	lsynthY < 0 || lsynthY > synthY) continue; 

Without any starting noise we simply select a random pixel from the source to begin synthesizing, then process pixels in scanlines going downward.
Running through the tiling pattern as source input, you can see it actually generates the pattern and tiles it!

You can decrease the accuracy of the pattern by reducing the pattern size. The image above was synthesized with patternSize=5; look what happens with patternSize=1, where only 3x3 areas of pixels are searched. It creates a very cool effect of a broken but similar pattern.

Now for the final test, our detailed source input! I used the same initial generation options of quantization=5, patternSize=5. There are two synthesized results, to highlight how the synthesis is random each time.
Source Pattern
Synthesis Result
Synthesis Result 2

It generated recognizable, but at the same time new, images! It's obviously not perfect but all things considered, I'm very happy with the results!
Below is what the final synthesis JavaScript code looks like:
function synthesize(synthWidth,synthHeight, sourceData,sourceWidth,sourceHeight) {
	//create buffer to store synthesized pixels
	var synthData = new Uint8Array(synthWidth*synthHeight*4),
		synthIndex = 0;
	//fill in first top left synthesis pixel with random sample
	var randInd = Math.floor(Math.random()*sourceWidth*sourceHeight)*4;
	for (var i = 0; i < 4; i++) synthData[synthIndex++] = sourceData[randInd++];
	//loop through rest of pixels (the first row starts at x=1 because pixel 0,0 is already filled)
	for (var y = 0; y < synthHeight; y++) {
		for (var x = y===0?1:0; x < synthWidth; x++) {
			//run our search function to find matches
			var found = search(sourceData,sourceWidth,sourceHeight, x,y,synthData,synthWidth,synthHeight);
			if (found.length) {
				//randomly sample one of matches
				randInd = found[Math.floor(Math.random()*found.length)];
			} else {
				//if no matches, fill in randomly
				randInd = Math.floor(Math.random()*sourceWidth*sourceHeight);
			}
			//convert pixel index to data array index and copy the 4 channels
			randInd *= 4;
			for (var i = 0; i < 4; i++) synthData[synthIndex++] = sourceData[randInd++];
		}
	}
	//return synthesized pixels
	return synthData;
}

Issues and Potential Improvements
-Artifacts in the synthesized patterns that follow the direction the pixels are filled in (left to right, top to bottom). This can easily be fixed in post-processing by mirroring/flipping the image.

-Better color comparison. As I said above, YUV would probably give better results. There is also currently no gamma correction, which causes non-uniform color densities in some images.

-Optimization. I haven't found any alternative to brute-force searching all surrounding pixels that achieves the same result, and I'm not sure one is possible.
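
As a starting point for the first two items, here is a rough sketch of two helpers: a horizontal flip for an RGBA buffer, and the standard sRGB to linear conversion that gamma-correct comparison would need. Both names are hypothetical, neither is part of the finished code, and how they would be wired into the synthesis is left open.

```javascript
// Mirror an RGBA pixel buffer left to right, returning a new buffer.
function flipHorizontal(data, width, height) {
	const out = new Uint8ClampedArray(data.length);
	for (let y = 0; y < height; y++) {
		for (let x = 0; x < width; x++) {
			const from = (x + y * width) * 4,
				to = ((width - 1 - x) + y * width) * 4;
			// copy all 4 channels of the pixel to its mirrored position
			for (let c = 0; c < 4; c++) out[to + c] = data[from + c];
		}
	}
	return out;
}

// Convert one sRGB channel (0-255) to linear light (0-1)
// using the standard sRGB transfer function.
function srgbToLinear(value) {
	const v = value / 255;
	return v <= 0.04045 ? v / 12.92 : Math.pow((v + 0.055) / 1.055, 2.4);
}

// a 2x1 image: the two pixels swap places
console.log(Array.from(flipHorizontal([1, 2, 3, 4, 5, 6, 7, 8], 2, 1)));
// prints [ 5, 6, 7, 8, 1, 2, 3, 4 ]
```

With srgbToLinear, the channel differences in the comparison step could be computed on linear values instead of raw 0-255 values; the quantization threshold would then need rescaling to the 0-1 range.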

Thanks for Reading!
If you have any questions feel free to reach out to me on Twitter or Discord.