You may have seen This Person Does Not Exist. If you haven’t, I recommend taking a look and reporting back here before reading further.
Technology has advanced to a point where we can generate photorealistic faces, discernable as fake only by eldritch criteria: iridescent glows, perfectly symmetrical eyes, the wrong number of teeth.
Not only that, thanks to Phil Wang, the creator of This Person Does Not Exist, it’s possible for the layman to train their own image generator (better known as a GAN, or Generative Adversarial Network) in just a few lines of code. With enough time, computing power, and data, you can make your own alternate reality versions of just about anything.
I’d been wanting to train my own GAN for a while, but it wasn’t until I learned about Japanese municipal flags that the project really gained steam. I mean, just look at these things. Wouldn’t the world be better off with more of them?!
But, like I said, training a GAN takes lots of time, computing power, and data.
This is a hobby project with no expectations or deadlines, so time wasn’t a problem.
Thanks to Google Colab, a wonderful offering that lets you run your code with Google’s most powerful GPUs, I had plenty of computing power.
The only catch was data–there are only about 4,000 municipal flags out there. The best, highest-quality GANs are trained with truly insane amounts of data, usually well over 50,000 pictures. My little GAN just wouldn’t be able to output beautiful, crisp flags without a bit of manual intervention.
A bit of manual intervention
To get the quality output I wanted, I decided to manually create high-res flags based on the 128×128 images output by the GAN.
After training my GAN, I generated several thousand images and hand-picked 16 of my favorites. I chose images not only from the final trained model but also from previous checkpoints for more variety.
To be fully transparent, the GAN output a lot more junk than gems. Getting the parameters right is hard, especially with so few training images.
I upscaled each of those 16 images in Photoshop and cleaned them up, sticking as closely as I could to the spirit of the originals. The only thing I changed was to ensure the flags were symmetrical–the originals were usually close but not quite there.
Finally, I wanted to name my new fake municipalities and give them some descriptions.
For both tasks, I used GPT2, a pre-trained text generation model that “knows” quite a lot about the world. You can fine-tune it by feeding it examples of the type of output you want, and within minutes it’ll create text in the style of your inputs. I followed this excellent tutorial by Max Woolf to get the job done.
For the town names, I just fine-tuned GPT2 on a list of Japanese municipalities I pulled from Wikipedia.
For the town descriptions, I used the Encyclopedia Britannica entries of actual Japanese towns.
GPT2 is “smart” enough that it “knows” real Japanese town names, so my generated descriptions all contained names of actual towns. I replaced the real names with my generated fake ones, and lightly edited the descriptions for grammar and punctuation.
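If you'd rather not do the name swap by hand, here is a minimal sketch of how it could be scripted. This is not the process I actually followed; the function name, the example sentence, and the name mapping are all made up for illustration.

```python
import re

def swap_town_names(text, name_map):
    """Replace whole-word occurrences of real town names with fake ones.

    name_map is a dict of {real_name: fake_name}.
    """
    if not name_map:
        return text
    # Try longer names first so a longer name wins over a shorter one it contains.
    ordered = sorted(name_map, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in ordered) + r")\b")
    return pattern.sub(lambda m: name_map[m.group(0)], text)

# Hypothetical generated sentence and mapping, for illustration only.
generated = "Takayama was a castle town. Takayama is now an industrial centre."
print(swap_town_names(generated, {"Takayama": "Mizushima"}))
```

The word-boundary match means a real name that only appears inside a longer word is left alone, so you would still want to read the result before publishing it.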
And finally, The Cities
Fukuno is well known in Japan for its annual festival, Fuguokkoza, in which large, elaborate floats are paraded through the streets before a contest to see who can move the float, the Fukuoka Phoenix. Fukuno is also the location of several historic sites, such as the ancient city of Fuchu, whose walls guard the city are one of the largest in Japan, and Tenryu, famous for its traditional fire temples. Pop. (2010) 140,109; (2015) 141,333.
Yamagotaka prefecture is a major tourist area, attracting tourists and residents of the surrounding mountainous terrain. The annual sea festival in the city centre features the display of traditional wooden floats and other marine products. The yearning for nature is evident in the way Yamagotaka landscapes and people are managed. For example, more than 90 percent of the province is designated as a Nature Reserve.
Fukukawa developed around the Sengen (Seikan) Temple that was built as a court residence of the king in the 19th century. During the Edo (Tokugawa) period (1603-1867) the city was a commercial centre, a court quarter, and a market for rice, vegetables, and flowers. The processing centre for the resin Oribe palm (Ortus sativus) was opened by the Portuguese in the city in 1888. The city’s carpets are made of the same fibreglass as those worn by the samurai.
Nishima, north-central Honshu, Japan, facing the Sea of Japan (East Sea). It includes the Oki Islands. Most of its area is mountainous, and there is a small coastal plain. Nishima has been of special historical interest to Japanese emperors and queens, and many of the buildings in the city contain relics of the ancient sites.
Mizushima was a castle town during the Edo (Tokugawa) period (1603-1867). The city is now a rubber and egg industry centre, and its products include chemicals, manufacture of stocks and loads of fertilizers and pesticides, and the manufacture of fishing rods and turtlesters clothing. Mizushima Gate, on the bend of the Yodo, is marked by a gigantic black cat that lives at large in the city. Mizushima is best known for its thermal power plant, which began operation in 1960. Pop. (2005) 169,755; (2010) 178,061.
The city of Onami (formerly Utsunomiya) is located in the central part of the Onagawa plain, on the Chikuma River. Its valley constitutes one of the most important agricultural regions in Japan. Onion and garlic (tai choyu) are the mainstays of the local economy, as are certain vegetables such as turnips, spinach, and lettuce. Onami city, its surrounding rural district, and the surrounding mountain clusters of volcanoes and hot springs constitute the Kanto Meteorological Area. Agricultural products of the region include rice, potatoes, sweet potatoes, sorrel, and melons.
Onami city is the site of the Todai Temple and a shrine dedicated to Shinto founder Mori Igarashi. The mountain of Gyokuro (also spelled Gyokuro Temple) is one of the most impressive in Japan. The surrounding area is renowned for its cherry, apple, and peach trees. Pop. (2005) 125,282; (2010) 130,061.
The western part of the prefecture is a vast expanse of land dominated by the Shoshone Range. Only very small areas of semi-arid land are cultivated, and the majority of the prefecture is mountainous. The mountainous plateau at Shiroi, on the Kurobe River, is noted for its ancient granite altars and Muslim shrines.
The city’s landscape is dominated by hills, and its industries are rubber, cement, chemical, and food-processing. The city is also the southern terminus of the East-West Air Tunnel, which facilitated air travel between Tokyo and Yokohama after World War II. Pop. (2005) 113,240; (2010) 116,852.
Koshima city lies about 15 miles (24 km) south of Tokay and stretches for several more miles along the coast. The city was founded as a post town in ancient time. The Kotohira Marinishigei Arlongengo (National Historic Park) is there, and the town became a favourite summer resort. Koshima is now a resort area for those of Japanese folk art, and its resorts and the towns of Serizawa and Tamanoi have modern theatres. Pop. (2010) 25,955; (2015) 24,015.
Takamoto was the seat of the first European printing house, Liberec, in 17th century Japan. Since the Meiji Restoration (1868) the city has been one of the principal commercial and industrial centres of eastern Shizuoka. Takamoto also houses the Takamoto Aquarium, one of the largest in Shizuoka.
An adventurer named Abel Tasman first visited Mitami in the early 17th century.
Mitami is dotted with hot springs and, during May and June, live tropical birds (called sarabashys) that attract many a visitors of all nations. There are also beaches and shopping areas. Near all are the Seven Dwarfs Mountains, one of Japans most celebrated expressions of Japanese folklore. Pop. (2005) 74,394; (2010) 71,999.
Ofu city’s first written record dates to 1152, when a group of refugees fleeing from Nara brought with them hymn books and sacred writing. By the Edo (Tokugawa) period (1603-1867), Ofu was a leading centre of feudal industry, including the production of movable type and the production of paper lacquer.
Sura was a small fishing village before the construction of Sura-Hakodate naval base in the early 20th century. With the opening of a railway connection to Osaka-Kobe International Airport in 1936, Sura became a major transportation centre. The base of the Ishiyama Mountains in the east and the Nikko Plain in the west attract many sea and railway travellers. The city is also a trade centre for agricultural products of the Ishikawa family of farms along the mountain slopes. Pop. (2005) 133,151; (2010) 132,672.
Gotone is highly regarded as a healthy village. The passage of a railway line beneath the city floor in the late 19th century initiated a prosperous industrialization that continued for more than a century. Industry of all kinds, from food processing to construction materials, was developed. The city is the hub of a tourism industry and a base for the Japan Coast Guard. Pop. (2010) 136,691
Miyayama, southeastern Kyushu, Japan, facing the Pacific Ocean. The southern coast contains Nichinan-kaigan Quasi-national Park, which includes the offshore island of Ao and is noted for its tropical and subtropical vegetation, wild horses, and monkeys. In the southwest, Kirishima-Yaku National Park is dotted with volcanoes, craters, and crater lakes. The prefecture is a major honeymoon spot.
Tokabari is one of the four principal islands of the Izu Peninsula, facing the Pacific Ocean on the southern shore of Kyushu. It is mostly mountainous, having a shallow northeastern and a deep southeast. The interior is composed chiefly of a shallow sea. The Abe River, division from Izu, is the major terrestrial river in the region. The northern and northeastern portions of the peninsula are mostly mountainous, and the southern portion, which is at the western foot of Mount Fuji, is a major nesting site of the black-capped koi-toa whale.
A few thoughts
I’m really intrigued by this intersection of code and art. I think there’s a lot of potential in exploring using a GAN to inspire a design and my own human brain to complete it. It opens up a new path of creativity that I’m really excited about…
I’ve cooked up an idea to combine GAN+human designs with embroidery! Watch this space to see if I ever get that done 🙃
…and now, for anyone interested:
How to train your own GAN
Step 1: Create a training dataset
Using Phil Wang’s code, training a GAN is incredibly easy. The hardest part by far is gathering an appropriate training dataset. Ideally, you’ll want to gather at least 10,000 images that are the same dimensions (usually square).
I scraped the municipal flags of Japan from this website and wrote a neat little R script to crop the images to just the central motif. After cropping and cleanup of duplicates, I had about 3,000 images.
This wasn’t quite enough, so I also added their rotations of 90, 180, and 270 degrees to bulk up the dataset a bit.
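For anyone who wants to do the rotation step in Python, here is a rough sketch using the Pillow library. It is not the script I used; the folder names, the PNG extension, and the "_rot90"-style file naming are assumptions for the example.

```python
from pathlib import Path

ANGLES = (90, 180, 270)  # the three extra orientations described above

def rotated_name(path, angle):
    """Build an output filename such as 'flag_rot90.png' for a rotated copy."""
    path = Path(path)
    return f"{path.stem}_rot{angle}{path.suffix}"

def augment_folder(src_dir, dst_dir):
    """Copy every PNG in src_dir to dst_dir along with its three rotations."""
    # Pillow is a third-party package (pip install pillow); it is imported here
    # so the naming helper above works without it.
    from PIL import Image

    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.png")):
        with Image.open(path) as img:
            img.save(dst / path.name)
            for angle in ANGLES:
                # expand=True keeps the whole picture if an image is not square
                img.rotate(angle, expand=True).save(dst / rotated_name(path, angle))

# Example call with hypothetical folder names:
# augment_folder("flags_cropped", "flags_augmented")
```

Each source image ends up as four files, which is how roughly 3,000 flags become roughly 12,000 training images.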
Step 2: Load your dataset to Google Drive
Pretty simple. Sign up for Google Drive if you haven’t already, then upload your dataset into a folder there. It took about 3 hours for all 12k images to upload.
Step 3: Connect to a GPU runtime on Google Colab
As mentioned above, Google Colab is a wonderful service that lets you execute Python code through your browser. Even better, you can borrow/buy snippets of time on Google’s ultra-powerful GPUs, which is absolutely crucial here.
First, go to Google Colab and create a new notebook.
In the top left corner, rename your notebook to something memorable.
At the top left, select Runtime > Change Runtime Type > Hardware Accelerator > GPU > Save
At the top right, click Connect to connect to the GPU runtime.
Step 4: Set up your code
Paste each of the following snippets into its own code block. You can add code blocks by clicking +Code in the top left corner.
This will link your Colab notebook to your Google Drive account:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
This will tell you what type of GPU you’re connected to:
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
print(gpu_info)
This installs Phil Wang’s code to train your GAN:
!pip install stylegan2_pytorch
This is the actual training code. Replace FOLDER with the name of your Google Drive folder containing the training images. Replace NAMEOFYOURPROJECT with whatever name you want (no spaces).
I suggest reading the How To for what all the parameters (e.g. attn-layers, batch-size) mean. These are just what I used; what works for your particular project may be different.
%cd /content/gdrive/My Drive/
!stylegan2_pytorch --data FOLDER/ --name NAMEOFYOURPROJECT --attn-layers [1,2] --batch-size 16 --gradient-accumulate-every 8 --aug-prob 0.25 --aug-types [translation,cutout,color]
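One relationship between those parameters is worth spelling out. As I understand gradient accumulation in general, the gradients from several small batches are summed before each weight update, so the batch size and the accumulation count multiply into an "effective" batch size. The helper names below are my own, and how the library treats this internally is best confirmed in its How To.

```python
import math

def effective_batch_size(batch_size, gradient_accumulate_every):
    """Images contributing to each weight update under gradient accumulation."""
    return batch_size * gradient_accumulate_every

def accumulation_steps(target_effective_batch, batch_size):
    """Accumulation steps needed to reach a target effective batch size,
    given the largest batch size that fits in GPU memory."""
    return math.ceil(target_effective_batch / batch_size)

# The settings used in the command above: 16 images x 8 accumulation steps.
print(effective_batch_size(16, 8))
```

If your GPU runs out of memory, one option is to lower the batch size and raise the accumulation count so the product stays about the same.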
This is what your finished notebook should look like:
Step 5: Train your GAN!
One by one, click into each of your code blocks and click the Run arrow at the left.
The first code block will give you instructions to connect to your Google Drive. Follow those before proceeding to the next block.
The second block tells you what type of GPU Google assigned you. Ideally you’ll get a Tesla V100 or P100, the two most powerful options. If you sign up for Google Colab Pro (totally recommended, I did it), you’re pretty much guaranteed to get one of those. If you get something like a K80, your code will still run but it’ll be very, very slow.
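If you'd rather not squint at the full nvidia-smi table, a small check like this could go in its own code block. It's only a sketch: the function names are mine, and the list of "fast" models simply mirrors the two GPUs named above.

```python
import subprocess

FAST_GPUS = ("V100", "P100")  # the two models described above as the most powerful

def is_fast_gpu(gpu_name):
    """True if the GPU name mentions one of the faster models."""
    return any(model in gpu_name for model in FAST_GPUS)

def current_gpu_name():
    """Ask nvidia-smi for the name of the first GPU (requires a GPU runtime)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip().splitlines()[0]

# On a connected GPU runtime you could then run:
# name = current_gpu_name()
# print(name, "- fast enough" if is_fast_gpu(name) else "- expect slow training")
```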
The fourth block is the real meat of the operation: it’s the actual GAN training. Every 1,000 iterations, it’ll save a set of sample images to your Google Drive in the “results” folder.
I just keep an eye on those results until I like what I’m seeing, or until it looks like my parameters have led to a dead end and I’m not getting to anything worthwhile. Either way, it’ll take at least a day and probably several to make meaningful progress.
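To save some clicking around in Google Drive, a helper along these lines can list the newest sample images from a separate notebook. It's a sketch under assumptions: I'm assuming the samples are ordinary image files sitting in a per-project subfolder of "results", and the function name is my own.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def latest_samples(results_dir, count=3):
    """Return the `count` most recently modified image files, newest first."""
    images = [p for p in Path(results_dir).iterdir()
              if p.suffix.lower() in IMAGE_SUFFIXES]
    images.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return images[:count]

# Assumed location; swap in your own project name.
results = Path("/content/gdrive/My Drive/results/NAMEOFYOURPROJECT")
if results.exists():
    for path in latest_samples(results):
        print(path.name)
```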
With Google Colab Pro, I’ve found that I can leave the code running for about 18-20 hours before my session disconnects. At that point I need to manually reconnect to the GPU and re-run all the code blocks in order. If you don’t have Colab Pro, you’ll probably be disconnected every few hours.
And that’s it!
I feel silly sometimes writing these simplistic tutorials, but it did legitimately take me a while to figure out how to get set up. Hopefully this will help somebody else!
Awesome work and I really appreciate the tutorial. Excited to try it. I forget if I first found you on reddit but this absolutely needs to be posted on r/vexillology
Thank you! I’m planning on posting to r/vexillology tomorrow morning 🙂