Show HN: Japanese City Name Generator – Using a Simple 3-Layer MLP

(citygen.freemanjiang.com)

26 points | by freemanjiang 392 days ago

7 comments

ranger_danger 392 days ago
I'm not sure why ML is even necessary? Practically every combination of characters (kana characters, where there's always a vowel at the end of each mora unless it's an "n") is already valid and doesn't even sound weird.
Can someone explain why a random() function given a list of kana characters could not produce equally good names?
- freemanjiang 392 days ago
  Hmm, I'm not convinced that uniform sampling from all possible kana characters necessarily leads to Japanese-sounding city names. I think the actual distribution does have a pattern (e.g. yama appearing more frequently).
  Here are 50 names I got Claude to generate from the uniform distribution: ['wamorumura', 'sohikotake', 'hiteitewau', 'romekarumu', 'nehami', 'miruyake', 'shiyuhaki', 'ahiyo', 'homaso', 'chionohoratsu', 'akusoyo', 'kiuhi', 'karoso', 'suhoheso', 'muchichi', 'mahakekanuto', 'usatsuwotoro', 'namusu', 'sokomeni', 'hakureromake', 'tosukonuka', 'haokehaso', 'nsesutemei', 'womiku', 'noereyasou', 'suyakenosu', 'ritasaifuka', 'ruremoteshi', 'yuhowotsuhie', 'torarenumeho', 'rutsueto', 'hamiakaki', 'sutsuyosano', 'yasotawaku', 'kihaso', 'koairieke', 'hosuriihiwa', 'horotowanno', 'wokiu', 'tanasochiriwo', 'otosetanu', 'rakamotorure', 'hawaniu', 'emoshiratsuhe', 'naroman', 'mohaesa', 'soniruta', 'nofuni', 'kayatakera', 'natayamume']
- asukachikaru 392 days ago
  Because Japanese words aren't simply strings of random characters, just as a random string of eight English letters doesn't suddenly become a meaningful city name such as Reading or Brighton.
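For reference, the uniform baseline debated in this sub-thread could be sketched roughly as below. This is only an illustration, not how the names quoted above were produced: the `KANA` inventory (the 46 basic morae in Hepburn romanization, with no voiced or contracted sounds), the `uniform_name` helper, and the 2 to 6 mora length range are all assumptions made for the example.

```python
import random

# Assumed inventory: the 46 basic morae in Hepburn romanization.
# Voiced ("ga"), semi-voiced ("pa") and contracted ("kyo") sounds are left out.
KANA = [
    "a", "i", "u", "e", "o",
    "ka", "ki", "ku", "ke", "ko",
    "sa", "shi", "su", "se", "so",
    "ta", "chi", "tsu", "te", "to",
    "na", "ni", "nu", "ne", "no",
    "ha", "hi", "fu", "he", "ho",
    "ma", "mi", "mu", "me", "mo",
    "ya", "yu", "yo",
    "ra", "ri", "ru", "re", "ro",
    "wa", "wo", "n",
]


def uniform_name(rng, min_morae=2, max_morae=6):
    """Hypothetical helper: join a random number of uniformly chosen morae."""
    count = rng.randint(min_morae, max_morae)  # inclusive on both ends
    return "".join(rng.choice(KANA) for _ in range(count))


rng = random.Random(0)  # fixed seed so the run is repeatable
names = [uniform_name(rng) for _ in range(50)]
print(names)
```

Every mora is equally likely at every position here, so frequent real-world components such as "yama" show up only by chance, which is the gap a learned distribution is meant to close.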
stuartcw 392 days ago
If you used the kanji names of the cities and towns it would be a lot more realistic.
I’ve lived in Japan since 1988 and this just seems like a list of gibberish to me. Japanese city names are, like English city names, made up of meaningful components, e.g. Newbridge, 新橋, しんばし, Shinbashi. So there is nothing to get a hook on. It’s just syllables.
Try it with 2000 English city names and you will get the same quality of output.
freemanjiang 392 days ago
One thing is that this is trained on a romanized (English-alphabet), character-level representation of kana, so it's possible it generates names that are not legal in the Japanese syllabary.
- RestartKernel 392 days ago
  Have you tried approaching this with the kanji instead? That seems like free tokenisation.
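One way to catch the illegal outputs mentioned above would be to test whether a generated string can be split entirely into known morae. The sketch below is an assumption-laden illustration, not part of the posted project: `build_morae` and `is_legal` are hypothetical names, the inventory follows Hepburn romanization, and doubled consonants (as in "Sapporo") and macron long vowels are not handled.

```python
from functools import lru_cache


def build_morae():
    """Assumed Hepburn inventory: basic, voiced and contracted morae."""
    morae = {"a", "i", "u", "e", "o", "n"}
    # Consonant rows, listing only the vowels that are spelled regularly.
    rows = {
        "k": "aiueo", "g": "aiueo", "s": "aueo", "z": "aueo",
        "t": "aeo", "d": "aeo", "n": "aiueo", "h": "aieo",
        "b": "aiueo", "p": "aiueo", "m": "aiueo", "y": "auo",
        "r": "aiueo", "w": "ao",
    }
    for consonant, vowels in rows.items():
        morae.update(consonant + v for v in vowels)
    # Irregular Hepburn spellings.
    morae.update({"shi", "chi", "tsu", "fu", "ji"})
    # Contracted sounds such as "kyo" or "sha".
    for onset in ["ky", "gy", "ny", "hy", "by", "py", "my", "ry", "sh", "ch", "j"]:
        morae.update(onset + v for v in "auo")
    return frozenset(morae)


MORAE = build_morae()


def is_legal(name):
    """Return True if the whole string can be segmented into known morae."""
    s = name.lower()

    @lru_cache(maxsize=None)
    def ok(i):
        if i == len(s):
            return True
        # Morae in this inventory are one to three letters long.
        return any(s[i:i + k] in MORAE and ok(i + k) for k in (1, 2, 3))

    return bool(s) and ok(0)


for candidate in ["kanegawa", "shinbashi", "brighton"]:
    print(candidate, is_legal(candidate))
```

A filter like this only checks spelling; it says nothing about whether a name sounds plausible or is built from meaningful components.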
cedws 392 days ago
I got Kanegawa, which is a real place, so I'd say it's pretty accurate!
- fph 392 days ago
  Maybe that name already was in the training set tho?
- ghfhghg 391 days ago
  Can you spell it?
kazinator 392 days ago
@freemanjiang, you might enjoy jp-hash: https://www.kylheku.com/cgit/jp-hash/about/
neuraldenis 391 days ago
Can you please share a training tutorial? Thank you!