User talk:Zzo38: Difference between revisions
Revision as of 17:07, 18 October 2021
Hi
I responded to your RELOAD comments including the textual format at Talk:RELOAD.
To run game.exe faster/slower, use Ctrl+Plus/Minus. That changes the number of milliseconds per frame (shown in the corner). You can also use the --runfast arg if you want no waiting at all, for something like testcases.
You can give negative experience with the giveexperience command. I guess that there's no reason not to allow negative experience from enemies as well.
All of the suggestions are excellent and I'll add any that aren't already there to my TODO list. Several of them will be pretty easy to add. --TMC (talk) 05:56, 29 November 2014 (PST)
Hi, good to hear from you again!
I very recently changed and cleaned up how ohrrpgce_config.ini works (I simplified it a lot, as there were too many problematic edge cases), so I am likely to start adding many more settings. You have a long list of requests here that I could pick through for ones worth the time. In the current wip version you can now put game.gfx.fullscreen = no in ohrrpgce_config.ini to override the default fullscreen setting in all games.
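For reference, that setting would sit in ohrrpgce_config.ini like so (only this one key is quoted in the discussion; what other keys the file supports, and its comment syntax, are not stated here):

```ini
game.gfx.fullscreen = no
```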
BTW, I've just started on moving to UTF8. I've also designed my own space-efficient 8-bit-extended-ASCII-compatible character encoding to be used in existing fixed-length string fields, to allow more than 256 characters in the font without switching those fields to UTF8. (Previously we discussed this at Talk:RELOAD#Comments_about_file_format.) Have you ever used characters 1-31 in a game, aside from \n and \t? I want to repurpose those for codes such as "switch codepage". Also, I agree that it's better not to move icons in existing fonts out of the C1 control area, but I need to think about what to do about characters 161-255 used for icons. --TMC (talk) 19:40, 14 October 2021 (PDT)
- I have used them (with PC fonts imported into OHRRPGCE), although the games have not been released yet. However, hopefully it shouldn't be too difficult to add a global definition (perhaps signalled by the format of the font lump) to disable the new feature if it is necessary to do so; if not, the game can be changed easily enough to support the new format, I suppose. ASCII codes 14 ("shift out") and 15 ("shift in") are standard codes for switching code pages, so it may be a good idea to use those (if they are enough; they might not be). Escapes could also be used (including to display characters in the 0-31 positions of fonts, possibly). For existing fonts that do not use 0-31, there is probably nothing to change and it should just work (at least for display). --Zzo38 (talk) 23:27, 14 October 2021 (PDT)
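The Shift Out/Shift In idea can be sketched as follows. This is illustrative only: it shows the standard ASCII meaning of SO/SI applied to font pages; TMC's actual encoding (below) uses different control codes, and the function names here are invented:

```c
#include <stddef.h>

/* Illustrative sketch: decode an 8-bit stream where ASCII SO (0x0E)
 * selects an alternate font page and SI (0x0F) returns to page 0,
 * following the standard meanings of Shift Out / Shift In.
 * Not the OHRRPGCE encoding, which defines its own control codes. */
enum { SHIFT_OUT = 0x0E, SHIFT_IN = 0x0F };

/* Calls emit(page, glyph) for each printable byte in the stream. */
static void decode_so_si(const unsigned char *s, size_t len,
                         void (*emit)(int page, int glyph)) {
    int page = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char b = s[i];
        if (b == SHIFT_OUT)      page = 1;  /* switch to alternate page */
        else if (b == SHIFT_IN)  page = 0;  /* back to default page */
        else                     emit(page, b);
    }
}
```

As Zzo38 notes, two codes only give you two pages, which may well not be enough; a multi-page font would need either more shift codes or a parameterised "switch codepage" code.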
- I forgot that those two ASCII codes exist; I've never seen them used. But my encoding so far uses 12 control codes (it's not finalised) in order to be more compact, plus more for compressed text markup codes, such as to change font or colour. Shift In/Shift Out don't match how my encoding works. Aside from \n and \t I'll also avoid \r, since some OHR file formats use it, and the first few characters, since there are fixed icons there in all OHR fonts, used by Custom; I think I've used those in a game at least once.
- Disabling the new encoding if the game's font is in the old .fnt format seems simple enough that I can do that. I didn't really want to add an explicit setting for it, because it would be useless for almost everyone, dangerous to change, adds more code, etc. --TMC (talk) 05:11, 15 October 2021 (PDT)
- Here is a draft decoder in FB. I'm not happy with the scheme yet. I'll create an article for it soon and write more. (I realised I should implement the decoder/encoder in C instead so other projects can reuse it.) I found two unicode encoding/compressing schemes that are very similar to mine, SCSU and UTF-C. Neither is popular. Despite being published on unicode.org, it seems almost no implementations of SCSU exist. Plus, SCSU is pretty complex and even UTF-C is more complex than I want, so I think I will use my own. Here is FB code for a decoder.
I tested the scheme out on the FF6 Advance Japanese script (version with kanji). After removing the line prefixes, the characters are 21% ASCII, 39% Hiragana, 16% Katakana, 25% other (Kanji). Without using the "non-jump" codes, the text encodes to 1.706 bytes per character on average (with the additional codes compression will be slightly better). But 10-bit relative characters provide almost no additional benefit over 9-bit (1.707 Bpc) or even 8-bit ones (1.715 Bpc), so I should probably switch to 9-bit ones. --TMC (talk) 06:35, 18 October 2021 (PDT)
- I think that treating the text fundamentally as Unicode is a mistake. Although many programs do this, I think that it is a bad idea (and my own projects aren't using Unicode, even when I do want to allow Japanese and other languages). Better would be just to use the font with multiple pages; they would be aligned as appropriate for the kinds of characters being used. A single shifting scheme can be used for the purpose of OHRRPGCE, although what each page means may vary depending on the font in use. (If translation between character sets/encodings is needed, another lump could be added to translate, although most of the program will probably not need such a thing.) (I believe that no one character set can ever be suitable for all uses, neither Unicode nor TRON nor anything else.) So, you probably won't need as much code space as Unicode (although I don't know if some uses might need as much or more, but probably OHRRPGCE needs less), and may wish to distinguish single-byte and double-byte pages (you can use double-byte pages for kanji, perhaps). --Zzo38 (talk) 11:04, 18 October 2021 (PDT)
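For a rough baseline against those figures: plain UTF-8 spends 1 byte on ASCII but 3 bytes on Hiragana, Katakana, and (nearly all) Kanji, since those blocks lie above U+0800 in the BMP. A back-of-envelope calculation on the quoted distribution (the percentages sum to 101% due to rounding) is:

```c
/* Back-of-envelope UTF-8 cost for the FF6 script distribution quoted
 * above: ASCII is 1 byte, while Hiragana, Katakana and (nearly all)
 * Kanji are 3 bytes each in UTF-8 (codepoints U+0800..U+FFFF).
 * The quoted percentages sum to 101% due to rounding. */
static double utf8_bytes_per_char(void) {
    return 0.21 * 1    /* ASCII */
         + 0.39 * 3    /* Hiragana */
         + 0.16 * 3    /* Katakana */
         + 0.25 * 3;   /* other (Kanji) */
}
```

This evaluates to about 2.61 bytes per character, so the scheme's 1.706 Bpc would be roughly a 35% saving over plain UTF-8 on this text.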
- Using unicode does have some disadvantages: it has a large space of codepoints, which leads to less efficient encoding schemes (such as utf8); it has modifier/combining codepoints and other control codes, which we wouldn't try to support aside from simply drawing accents over the previous character; and it has multiple forms for the same character using different combining forms (I already have partially_normalise_unicode() in unicode.c to clean these up). But ultimately, the engine receives unicode input (as the user types, or when importing files), and aside from normalising composite characters it seems like extra complication to invent our own character set, which is ultimately just going to be based on unicode codepoints anyway, so it would be more correct to call it a subset of unicode. Fonts will still be organised into pages though. So you can add a Hiragana codepage, which provides characters for the unicode Hiragana block, initialised to defaults from an external font file (e.g. ttf), and customise them. As for the inefficiency of utf8, I don't like it, but it matters extremely little, and it's the standard and therefore the sensible thing to use for variable length strings. --TMC (talk) 17:13, 18 October 2021 (PDT)
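To make the utf8 space-efficiency point concrete, here is the standard UTF-8 encoding algorithm for a single codepoint (the general algorithm, not code from the engine's unicode.c): ASCII stays at 1 byte, but every kana or kanji codepoint costs 3.

```c
#include <stddef.h>

/* Minimal standard UTF-8 encoder for one codepoint. Returns the number
 * of bytes written to out (1..4), or 0 for codepoints past U+10FFFF.
 * (Surrogate codepoints are not rejected, for brevity.) */
static size_t utf8_encode(unsigned int cp, unsigned char out[4]) {
    if (cp < 0x80) {                 /* 1 byte: ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    } else if (cp < 0x800) {         /* 2 bytes: Latin supplements etc. */
        out[0] = 0xC0 | (cp >> 6);
        out[1] = 0x80 | (cp & 0x3F);
        return 2;
    } else if (cp < 0x10000) {       /* 3 bytes: rest of BMP, incl. kana/kanji */
        out[0] = 0xE0 | (cp >> 12);
        out[1] = 0x80 | ((cp >> 6) & 0x3F);
        out[2] = 0x80 | (cp & 0x3F);
        return 3;
    } else if (cp < 0x110000) {      /* 4 bytes: astral planes */
        out[0] = 0xF0 | (cp >> 18);
        out[1] = 0x80 | ((cp >> 12) & 0x3F);
        out[2] = 0x80 | ((cp >> 6) & 0x3F);
        out[3] = 0x80 | (cp & 0x3F);
        return 4;
    }
    return 0;
}
```

For example, U+3042 (Hiragana A) encodes to the three bytes E3 81 82.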
- OK, but what I am saying is to internally not use Unicode for text handling and use only the encoding of the fonts, and for translation to be performed separately when receiving input (and when exporting, if needed). One option can be used to enable or disable character code translation when reading/writing external files (in case you do not want to use Unicode; I will not want to use Unicode). I am not trying to say that OHRRPGCE should define its own character set; it specifically should not, and also should not care (although it would have its own encoding scheme to select pages, similar to what you wrote). Perhaps, when editing fonts, you can specify the number that the game encoding uses, and optionally specify the corresponding Unicode code point (or pair of Unicode code points, which may sometimes be necessary); the Unicode numbers would then be used only for input and not for internal handling. I think that your idea of shift coding is mostly good (although with perhaps a few differences, like I mentioned); your scheme is better than SCSU and UTF-C. (I have tried to write my suggestions/arguments, but if you dislike them then you can do it your own way, I suppose. My ideas are merely that: ideas only. Note also that I have not worked with OHRRPGCE recently (and don't know when or if I will in future, although one thing I have done recently is to add OHRRPGCE and some related file formats into the Just Solve the File Format Problem wiki).) --Zzo38 (talk) 18:07, 18 October 2021 (PDT)
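The font-editor mapping described here could be represented by a table like the following. All names and the layout are hypothetical, purely to illustrate the idea: each font slot records its game-encoding number plus an optional Unicode codepoint or pair, consulted only when translating input or exports, never for internal text handling:

```c
#include <stddef.h>

/* Hypothetical mapping entry for the font-editor idea described above:
 * the character's number in the game's own encoding, plus the Unicode
 * codepoint(s) it corresponds to. unicode[1] covers the occasional case
 * needing a pair (e.g. base letter plus combining mark); 0 = unused. */
struct glyph_mapping {
    unsigned short game_code;   /* number in the game's own encoding */
    unsigned int unicode[2];    /* 1 or 2 codepoints; 0 = unused slot */
};

/* Translate one Unicode codepoint (single, not a pair) to a game code
 * via the table; returns -1 if the font has no glyph for it. */
static int unicode_to_game(const struct glyph_mapping *map, size_t n,
                           unsigned int cp) {
    for (size_t i = 0; i < n; i++)
        if (map[i].unicode[0] == cp && map[i].unicode[1] == 0)
            return (int)map[i].game_code;
    return -1;
}
```

A real implementation would presumably use a hash table or sorted array rather than a linear scan, and would also need the reverse (game code to Unicode) direction for exporting.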