Interactive Text Hooker - new text extraction tool


  • Interactive Text Hooker - new text extraction tool

    Interactive Text Hooker (ITH) is a tool that helps you extract text from Japanese games.
    It works much like AGTH, so if you are familiar with AGTH you will find ITH easy to use.
    ITH is still under development and not yet fully stable. Please help me test it and report any bugs you find.
    Suggestions for new features or improvements are also welcome.
    Latest ITH 2.3 (2011.7.9). ITH64 1.0 (2011.5.15). 3.0 test.
    Latest engine support module(10.15).


    Manual & Tutorials
    User manual: http://code.google.com/p/interactive...iki/UserManual
    English: http://craneanime.blogspot.com/2011/01/tutorial-ith-interactive-text-hooker.html
    Korean: http://blog.naver.com/foolmaker/30098345502
    Vietnamese: http://vnsharing.net/forum/showthread.php?t=235841
    I found these tutorials through Google. Thanks to their authors; I'm just too lazy to write one myself.
    If you write a tutorial too, please send me the link and I will add it here.
    Chinese (registration required): http://bbs.sumisora.org/read.php?tid=10997379
    Written by me. The Chinese version of ITH can also be found there.

    Links:
    ITH at Google code: http://code.google.com/p/interactive-text-hooker/
    ITH at Google group: http://groups.google.com/group/interactive-text-hooker
    AGTH main thread by Setx: http://www.hongfire.com/forum/showthread.php?t=36807
    AGTH tutorial by fhc: http://www.hongfire.com/forum/showthread.php?t=59189
    Advanced AGTH Video Tutorials by Freaka: http://www.hongfire.com/forum/showthread.php?t=80401
    Translation Aggregator by ScumSuckingPig: http://www.hongfire.com/forum/showthread.php?t=94395
    VirusTotal: http://www.virustotal.com/

    System requirements:
    Intel Pentium 4 or later processor. The recommended OS is Windows XP or later.
    Technically, your processor must support SSE2 and your OS must support common control library version 6.
    ITH is also expected to work on 64-bit Windows.

    Basic usage:

    Please put ITH.exe, ITH.dll and ITH_engine.dll in the same folder.
    To let ITH extract text from a game, click the Process button;
    it opens a dialog with a process list. Find your target and click Attach.
    When you no longer want to extract text from that game, click Detach to tell ITH to stop extracting.
    After attaching, the shorter drop-down list will contain the PID and name of the game you selected.
    If ITH can extract some text, the longer drop-down list in the main window will have more than one item.
    When you select an item, its text appears in the large text box.
    Each item is called a thread. Go through the items and check whether the text matches
    the text in the game.

    User-defined hooks:

    Also called UserHook; AGTH uses this term in its thread.
    When the default hooks don't give you the text you want, you need to install a user-defined hook.
    A special string is needed to tell ITH which hook you want to install.
    This string is called an H-code in AGTH terms. It usually depends on the game and its version.
    Refer to the AGTH help and Freaka's video tutorials for further information.
    ITH can handle AGTH H-codes, so if you have an AGTH H-code for a game, it should work in ITH as well.
    Type this string directly into the process list (the shorter drop-down list) and press Enter; a new hook will be installed.

    Thread window:

    Here you can manage thread links and comments. Thread linking is a mechanism for merging threads.
    Select the sender thread at the top, then select a thread in the "link to" list.
    Click Set to create a link to that thread. Note that cyclic links, such as 1>2>3>1, are not permitted.
    The link list shows all threads on this chain one by one.
    "Last sentence" contains the last sentence from the thread.
    The comment is some text describing the thread.
    After you comment on a thread, its name changes in the main window.
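
    The no-cycle rule can be sketched in Python. This is only an illustration, not ITH's actual code (ITH itself is written in C++). It assumes each thread links to at most one other thread, and the names links, would_create_cycle, set_link and chain are made up for the example.

```python
from typing import Dict, List


def would_create_cycle(links: Dict[int, int], src: int, dst: int) -> bool:
    """Return True if adding the link src -> dst would close a loop."""
    node = dst
    seen = set()
    # Walk the existing chain starting at dst; reaching src means a cycle.
    while node is not None and node not in seen:
        if node == src:
            return True
        seen.add(node)
        node = links.get(node)
    return False


def set_link(links: Dict[int, int], src: int, dst: int) -> None:
    """Create the link src -> dst, refusing cyclic links such as 1>2>3>1."""
    if would_create_cycle(links, src, dst):
        raise ValueError(f"link {src}>{dst} would create a cycle")
    links[src] = dst


def chain(links: Dict[int, int], start: int) -> List[int]:
    """List every thread on the chain that starts at `start`."""
    result = [start]
    while result[-1] in links:
        result.append(links[result[-1]])
    return result


links: Dict[int, int] = {}
set_link(links, 1, 2)
set_link(links, 2, 3)
print(chain(links, 1))                  # threads on the chain starting at 1
print(would_create_cycle(links, 3, 1))  # would closing 1>2>3>1 be a cycle?
```

    Checking before the link is stored keeps every chain finite, so listing a chain always terminates.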


    Hook window:

    Clicking Hook in the main window opens a dialog that helps you manage hooks.
    It is meant for advanced users who are familiar with H-code internals.

    H-code defined by Setx: /H[X]{A|B|W|S|Q|H}[N][data_offset[*drdo]][:sub_offset[*drdo]]@addr[:module[:{name|#ordinal}]]
    addr -> hook address. data_offset -> data offset (when data_offset is negative, subtract 4 more from it in this window, e.g. -8 for EAX, although EAX is still -4 in the H-code).
    *drdo -> data indirection when it follows data_offset, split indirection when it follows sub_offset.
    sub_offset -> split parameter; also subtract 4 when negative.
    Module/Function Base (ITH original): fill in the hash values of the module and function name here. Enter a string in the blank on the right,
    click Hash Module/Function, and the hash value is calculated and filled into these two blanks.
    The checkboxes on the left enable the corresponding functions.
    The checkboxes on the right correspond to the charset options.
    A -> Big Endian (ITH differs from the AGTH definition), B -> None, W -> Unicode,
    S -> String, Q -> String & Unicode, H -> Hex value, N -> No context.
    Last Char (ITH original): takes a string pointer and extracts the last character of that string.
    Click Generate Code to see the H-code of this hook at the bottom.
    Note that module and name are strings in an AGTH H-code, but ITH cannot recover a string from its hash.
    Click Remove Hook to remove the currently selected hook from the target process and clear all threads from that hook.
    Click Modify Hook to modify the currently selected hook. In fact, the original hook is removed and ITH inserts a new hook based on the parameters in the hook window.
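
    To make the H-code layout easier to follow, here is a rough Python sketch that splits a code into its fields according to the Setx format quoted above. It is not the parser ITH or AGTH uses. It assumes all numbers are hexadecimal, it does not handle the ITH-only ! hash form, and the names HCode and parse_hcode are made up for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional

HEX = r"-?[0-9A-Fa-f]+"

# /H[X]{A|B|W|S|Q|H}[N][data_offset[*drdo]][:sub_offset[*drdo]]@addr[:module[:{name|#ordinal}]]
HCODE_RE = re.compile(
    rf"""^/H
    (?P<x>X)?
    (?P<type>[ABWSQH])
    (?P<no_context>N)?
    (?:(?P<data_offset>{HEX})(?:\*(?P<data_ind>{HEX}))?)?
    (?::(?P<sub_offset>{HEX})(?:\*(?P<sub_ind>{HEX}))?)?
    @(?P<addr>[0-9A-Fa-f]+)
    (?::(?P<module>[^:]+)(?::(?P<name>.+))?)?
    $""",
    re.VERBOSE,
)


@dataclass
class HCode:
    x_flag: bool                  # optional X after /H
    type: str                     # one of A, B, W, S, Q, H
    no_context: bool              # optional N
    data_offset: Optional[int]
    data_indirect: Optional[int]  # *drdo after data_offset
    sub_offset: Optional[int]     # split parameter
    sub_indirect: Optional[int]   # *drdo after sub_offset
    addr: int
    module: Optional[str]
    name: Optional[str]           # function name, or "#ordinal"


def _hex(value: Optional[str]) -> Optional[int]:
    """Convert an optional hex string (possibly negative) to int."""
    return None if value is None else int(value, 16)


def parse_hcode(code: str) -> HCode:
    """Split an H-code string into its fields (hex numbers assumed)."""
    m = HCODE_RE.match(code.strip())
    if m is None:
        raise ValueError(f"not a recognizable H-code: {code!r}")
    return HCode(
        x_flag=m.group("x") is not None,
        type=m.group("type"),
        no_context=m.group("no_context") is not None,
        data_offset=_hex(m.group("data_offset")),
        data_indirect=_hex(m.group("data_ind")),
        sub_offset=_hex(m.group("sub_offset")),
        sub_indirect=_hex(m.group("sub_ind")),
        addr=int(m.group("addr"), 16),
        module=m.group("module"),
        name=m.group("name"),
    )


print(parse_hcode("/HA4@123:foo.bar:abc"))
```

    Everything after the second colon is treated as the function name, so decorated C++ names that contain @ are kept whole.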


    Profile:

    After attaching to a process, you can add it to the profile. ITH will record its path.
    If you enable auto inject in the option window, ITH will monitor running processes and attach to any whose path has been recorded.
    You can also assign up to 4 user-defined hook codes (H-codes) to a record.
    If you enable auto insert in the option window, ITH will insert these hooks after attaching.
    A hook code that contains a module/function name will be transformed into hash values.
    Both forms represent the same hook. The module name is case-insensitive, while the function name is case-sensitive.
    The original : is changed to ! to indicate that a hash value follows.
    e.g. /HA4@123:foo.bar:abc -> /HA4@123!BD097770!C5840063

    On the left is a list of all games you have attached to and added to the profile.
    The other three boxes store information about remotely downloaded profiles.
    Click Refresh to list all profiles stored locally. You can update this list with the updater.
    Click a game on the left and click Find; ITH will look for a corresponding profile based on the executable's hash value.
    Then click Import to copy all the information and insert the hooks.


    Option:

    Split time: time interval after which a line break is inserted. At least 100.
    Process delay: the time ITH takes per process to check whether it is in the profile. At least 50.
    If there are N processes running on your system, one round takes N*PD.
    This is the longest time ITH waits before attaching after a process in the profile is launched.
    Inject delay: how long ITH waits to attach after a process in the profile is found. At least 1000.
    Insert delay: how long ITH waits to insert hooks after attaching. At least 200.
    Auto attach: ITH will automatically attach to processes in the profile.
    Auto insert: ITH will automatically insert hooks after attaching. Note that auto insert will not work unless auto attach is enabled.
    All times are in milliseconds.
    Suppress: enables repetition suppression. This handles cases like ABCABCABC.
    Clipboard: ITH will copy the last sentence to the clipboard, so that other tools that monitor the clipboard can use it.
    Here "last sentence" means the characters from right after the last line break up to the current character.
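
    As an illustration of what the Suppress option is meant to handle, the Python sketch below collapses a sentence that consists of one phrase repeated several times. It is an assumption about the idea, not ITH's real algorithm, and the function name collapse_repeated_phrase is made up.

```python
def collapse_repeated_phrase(text: str) -> str:
    """Collapse a string made of one phrase repeated two or more times.

    Tries every candidate phrase length, shortest first, and returns the
    phrase when repeating it reproduces the whole text exactly.
    """
    n = len(text)
    for period in range(1, n // 2 + 1):
        if n % period == 0 and text[:period] * (n // period) == text:
            return text[:period]
    return text  # no whole-string repetition found


print(collapse_repeated_phrase("ABCABCABC"))  # ABC
print(collapse_repeated_phrase("ABCABD"))     # ABCABD
```

    Text that is not an exact repetition is returned unchanged, so ordinary sentences are left alone.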

    Global filter: a customizable filter that applies to all threads.
    Currently only a single-character policy is implemented.
    I may introduce more complex rules in the future.
    All characters in the filter list are filtered out before the text is dispatched to the corresponding thread,
    so those characters will not appear in the final output.

    A full-width space at the beginning is filtered by default. If it appears in the middle of a sentence,
    add it explicitly to the global filter list.
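
    A single-character filter of this kind can be sketched in a few lines of Python. This is a simplified illustration, not ITH's implementation; the function name apply_global_filter and its parameters are hypothetical.

```python
FULLWIDTH_SPACE = "\u3000"  # the full-width (ideographic) space


def apply_global_filter(text: str, filter_chars: str) -> str:
    """Drop every character listed in filter_chars from text.

    A full-width space at the very beginning is removed by default;
    one in the middle stays unless it is listed in filter_chars.
    """
    text = text.lstrip(FULLWIDTH_SPACE)
    return "".join(ch for ch in text if ch not in filter_chars)


print(apply_global_filter("\u3000こんにちは▽", "▽"))
```

    Because the filter runs before text reaches any thread, a character such as ▽ added to the list disappears from every thread's output.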



    Miscellaneous:

    Top: ITH stays on top while this button is pushed.
    Clear: wipes the text in the current thread.
    Save: saves the profile for the current game.
    This includes UserHooks, thread links, thread comments, and the currently selected thread.

    Suspend/Terminate thread: you can suspend or terminate a thread of a process.
    Select a thread and an operation type, then click Execute.
    There is a box in the upper right corner of the process dialog.
    If you enter a function address there, the operation is performed on all threads with that start address.

    ITH can attach to multiple processes at the same time, although this seems of little use for now.
    If you close ITH while a program it is attached to is still running,
    ITH will automatically re-attach to that program the next time you open it.

    Link: You can type L[num1]-[num2] in the command line (without the brackets, numbers only).
    ITH will create a link from thread num1 to thread num2.
    All text that thread num1 receives will also be sent to thread num2.
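
    The Link command and its effect can be sketched like this in Python. It is only a sketch under assumptions: thread numbers are read as decimal, forwarded text is passed further along a chain of links, and parse_link_command and deliver are hypothetical names, not ITH functions.

```python
import re
from collections import defaultdict
from typing import DefaultDict, Dict, Optional, Tuple

LINK_RE = re.compile(r"^L(\d+)-(\d+)$")


def parse_link_command(command: str) -> Optional[Tuple[int, int]]:
    """Parse 'L<num1>-<num2>' into (num1, num2); None if it does not match."""
    m = LINK_RE.match(command.strip())
    return (int(m.group(1)), int(m.group(2))) if m else None


def deliver(thread: int, text: str, links: Dict[int, int],
            buffers: DefaultDict[int, str]) -> None:
    """Append text to a thread and to every thread it is linked to."""
    seen = set()
    node: Optional[int] = thread
    while node is not None and node not in seen:
        buffers[node] += text
        seen.add(node)
        node = links.get(node)


links: Dict[int, int] = {}
parsed = parse_link_command("L1-2")
if parsed is not None:
    src, dst = parsed
    links[src] = dst

buffers: DefaultDict[int, str] = defaultdict(str)
deliver(1, "text from thread 1", links, buffers)
print(buffers[2])  # thread 2 received the text as well
```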

    ITH also removes single-character repetition, as in AAABBBCCC....
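
    One way to remove this kind of repetition is to look for a repeat count k such that the text splits into blocks of k identical characters. The Python sketch below shows that idea; it is an assumption about the approach, not ITH's actual code, and collapse_char_repetition is a made-up name.

```python
def collapse_char_repetition(text: str) -> str:
    """Collapse text where every character is repeated the same k times."""
    n = len(text)
    # Try the largest repeat count first so the text collapses fully.
    for k in range(n, 1, -1):
        if n % k == 0 and all(
            text[i:i + k] == text[i] * k for i in range(0, n, k)
        ):
            return text[::k]  # keep one character from each block
    return text


print(collapse_char_repetition("AAABBBCCC"))  # ABC
print(collapse_char_repetition("AABBC"))      # AABBC
```

    Requiring the same repeat count for every character means a sentence with an occasional genuine double letter is left untouched.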


    ITH64:

    Based on worldwide data taken from Windows Update during June 2010, 46% of Windows 7 PCs run the 64-bit edition of Windows 7.
    It is likely that more and more game engines will get a 64-bit version. One already exists (CMVS64).
    Neither the current ITH nor AGTH can hook a 64-bit process, since both are 32-bit programs.
    ITH64 is designed to address this problem. It is a native 64-bit program whose internal architecture has been reworked to fit the 64-bit environment.
    Although it would be possible for ITH64 to hook 32-bit processes, I want to leave that task to the original ITH for now.
    In other words, ITH64 will NOT hook ANY 32-bit process. Please use the original ITH instead.
    Maybe at some point in the future I will write a compatibility layer. Then you would need only ITH64 for all your hooking tasks.

    Using ITH64 is almost the same as using the original ITH. The only difference is how registers are represented in H-codes.
    The original H-code has the following register map:
    EAX -> -4, ECX -> -8 ... EDI -> -20
    The new 64-bit version looks like this:
    RAX -> 0, RCX -> -8 ... RDI -> -38, R8-> -40 ... R15 -> -78
    It becomes zero-based and the increment changes from 4 to 8.

    Example code for the current CMVS64 engine:
    /HA-40:-48@4E050:cmvs64.exe
    This means that at 4E050 in module cmvs64, r8 contains the data and r9 stores the split parameter.
    Be aware of the architecture difference when writing H-codes for ITH64.
    I strongly recommend that new codes use the base-offset style to indicate the real address,
    not only because addresses have become longer, but also to avoid problems when the target module is mapped at a random address.
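
    The two register maps can be written out as a small Python table. The entries hidden behind the "..." are filled in using the usual x86 register numbering order (EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI), which is an assumption, although it agrees with every value listed above. Offsets are hexadecimal, and the function names are made up for the example.

```python
REGS32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]
REGS64 = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"] + [
    f"R{i}" for i in range(8, 16)
]


def offset32(reg: str) -> int:
    """Original H-code: starts at -4 for EAX and steps by 4."""
    return -4 * (REGS32.index(reg.upper()) + 1)


def offset64(reg: str) -> int:
    """ITH64 H-code: zero-based (RAX is 0) and steps by 8."""
    return -8 * REGS64.index(reg.upper())


def as_hcode_hex(value: int) -> str:
    """Format an offset the way it is written inside an H-code."""
    return f"-{-value:X}" if value < 0 else f"{value:X}"


for reg in ("EAX", "EDI"):
    print(reg, as_hcode_hex(offset32(reg)))
for reg in ("RAX", "RDI", "R8", "R9", "R15"):
    print(reg, as_hcode_hex(offset64(reg)))
```

    With this table, the -40 and -48 in the CMVS64 example code correspond to R8 and R9.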


    Why ITH:

    AGTH is a big success in text extraction.
    With its UserHook function it can solve more than 95% of current text extraction issues.
    But new games usually need an H-code before AGTH works, and ordinary users have no way to write one.
    ITH is designed to recognize many more game engines than AGTH and to insert the proper hooks automatically.

    1)ITH can now detect many popular game engines.
    Currently KiriKiri, BGI, RealLive, ShinaRio, CMVS, MAJIRO, rUGP, Malie, NitroPlus, Lune, QLIE,
    Apricot, CandySoft, AB2Try, Debonosu, System40, CIRCUS, AtelierKaguya, Waffle, YU-RIS,
    TinkerBell, AbelSoftware, SofthouseChara, LiveMaker, Bruns, CaramelBox, Pensil.
    More will be added later. If you find an engine ITH currently can't detect, feel free to request it here.
    I will then study that engine and try to find a way to detect it.
    Generally speaking, ITH works without special codes for more than 70% of newly released games.

    2)ITH has a graphical interface for attaching and inserting hooks. You don't need to pass parameters to ITH via cmd or a shortcut's target line.

    3)ITH can insert multiple UserHooks into the target process, while AGTH can insert only one.

    4)ITH can join threads together as you wish (Link function), while AGTH joins many together, sometimes including useless threads.
    Since ITH can insert multiple UserHooks, you can also join text from different hooks together.
    This is useful when the text processing function appears in different places.

    5)ITH can detach from a process and remove/modify UserHooks while the process is running.
    You don't need to restart the process when you find you have inserted the wrong hooks.
    Bad hooks won't crash the process; they just yield an error message.
    This means you can use trial and error to guess hook codes more efficiently.

    6)ITH is open source and under development. More features will be added in future versions.

    a) AGTH has an option to hook common system routines (/X?), while ITH currently only hooks APIs in GDI32.dll.


    IMPORTANT note:
    I have submitted this program to VirusTotal, and some anti-virus software reports ITH as malware.
    I use NOD32 and it reports nothing here. ITH uses some aggressive techniques that may also be used by viruses.
    ITH requires administrative privileges to function properly, which means it has the potential to damage your computer.
    I promise that the original ITH will not
    1)spy on programs other than those you tell it to attach to,
    2)create/modify/move/delete any files without an explicit prompt, other than "ITH.pro" and "ITH.ini" residing in its folder
    (in the case of ITH64, "ITH64.pro" and "ITH64.ini" respectively),
    3)create/write/delete any system registry keys,
    4)send/receive any information over the network.
    Make sure you have checked the hash values to ensure you have the original version.

    Hash value of ITH

    Hash values for current ITH.exe
    MD5 : 339360e57c9940ab33631071947a8e42
    SHA1 : f8a1b98c7b77b0b1fa45a1998bb80c0a6f34aad2
    SHA256: 540a5ec6f6d5092b1d76f96427d8cd344103b256e4526ad4602b1a25ef1c882e

    Hash values for current ITH.dll
    MD5 : 2685073a5825725d09bb6671f99ac151
    SHA1 : 6e5cffd1886d7d131c91a6afe20244b79f3d89ac
    SHA256: 34548ec22b4e22774255b13c0ac799d13a535c03bcb84a0e4aad3a06d454ad6e

    Hash values for ITH_engine.dll (2011.7.9)
    MD5 : 6d9cd2bf506aede1bcc40b1db8b116e0
    SHA1 : f066d124ee91de2da0dc89c864390969602aff3b
    SHA256: c61e1fe060d5bf6e69c1b70a122f51e406714e7821ea43761e0046ad6dc505ce

    Hash values for current ITH_engine.dll (2011.10.15)
    MD5 : c027319d9f652747c2beb9be2cc0a6e7
    SHA1 : cdb6b380e859cf45c6747ac5880d2ac4896c251d
    SHA256: faca50c43ab62b2366d4b927d234dcf4522e01e369c415a9ac9b1d27dd632d90


    Hash value of ITH64

    Hash values for current ITH64.exe
    MD5 : 394b168b58e2f8da89fa73c507ea1136
    SHA1 : 412a3fdac7b9ec42bafd30f2fc2f821c25c2513d
    SHA256: 180dea1d34c23260bfbef0d529ecfdea437395d86ca8f44ac6a1730c44a51b0d

    Hash values for current ITH64.dll
    MD5 : 523089418cc41e410f1c58f71e277b2b
    SHA1 : 251eb10d20d1083e99d2697db30d987c561bd970
    SHA256: 12073ed66c13967818b7778ce3e5ac37f0edd87f4e8adf9b6c4c02a4849b0511

    Hash values for current ITH64_engine.dll
    MD5 : 2657858b2beb04dc104adcccbb343691
    SHA1 : 6bb730cbfc7715d5b8c5f56c008923d6d25a5ebf
    SHA256: 78b4ae62b19e04bce75ba2bdd5bf7ca3c1d5db3e95df53db9e8c2117329e6a35
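
    If you would rather check the hashes with a script than with a separate tool, something like the following Python sketch should do. It uses only the standard hashlib module; the helper names file_digests and matches_published are made up, and the file path in the usage comment is just an example.

```python
import hashlib
from typing import Dict


def file_digests(path: str, chunk_size: int = 65536) -> Dict[str, str]:
    """Compute the MD5, SHA1 and SHA256 hex digests of a file in one pass."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}


def matches_published(actual: str, published: str) -> bool:
    """Compare digests, ignoring whitespace and letter case in the pasted value."""
    return actual.lower() == "".join(published.split()).lower()


# Example usage (the path is hypothetical):
#   digests = file_digests("ITH.exe")
#   print(matches_published(digests["md5"], "339360e57c9940ab33631071947a8e42"))
```

    Ignoring whitespace in the pasted value matters here because a forum may insert spaces into long strings.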


    About the source:

    From 2.2 on, the source code of ITH is under GPLv3. Older source is no longer available.
    ITH is written in C++ and inline assembly, compiled with VC10.0.
    A ready-to-compile project pack is also uploaded. Please get ntdll.lib and msvcrt.lib from the latest WDK.
    I started developing ITH with VS10.0, so it may be inconvenient for those using versions older than 10.0.
    I develop this program on 64-bit Windows 7, so it is expected to work well on both 32-bit and 64-bit OSes.
    Last edited by kaosu; 03-02-2012, 08:27 PM.
    Got stuck at AGTH H-codes? Have a try of ITH, supports more game engine.
    http://www.hongfire.com/forum/showthread.php?t=208860

  • Thank you all for the help. Oh, a new version, let me try it!
    P.S. The newest version of ITH works well with the game Marguerite Sphere: it captures every line normally, without repeated lines or missing words. Thanks so much, kaosu.
    Last edited by Tsuchiya90; 04-24-2011, 12:50 AM.



    • @foolmaker: Thanks for your tutorial. It seems there are a lot of comments on it.
      If any of them are about bugs or improvements, please help report them to me, in English.
      Yes, I'm the author of the Korean tutorial. FIRST OF ALL, many Koreans appreciate your work developing ITH.
      As you said, there are several problems with using ITH, and most of them are related to the /KF option.
      I'll hand out v2.2 to Korean users and will hopefully report back soon if any bugs turn up.

      Some requests:
      * This is just a personal request... I hope ITH will also support the 'Livemaker' engine (a game engine newly supported by AGTH).
      It is very widely used for doujin games.

      * The Link function (typing L[Num1]-[Num2] in the command line) doesn't work well. There is no response to the Link command (ver2.2).

      * When using the Thread Editor to link, ITH sometimes breaks and closes after a link is set (ver2.2).
      This may happen when there are many sentences in the linked threads.

      Best regards.
      Last edited by foolmaker; 04-24-2011, 10:26 AM.



      • While initializing this game, i get the following out of ITH:

        Inject process 3600. Module base 02940000
        3600: OKAIPA.exe
        3600: Pipe connected.
        3600: 000025EB
        3600: 00400000:00C98000
        3600: 5ECFE5CD3D83347297F198A5C812A836
        3600: Unknown engine.
        3600: Initialized successfully.

        No text hooks are generated while playing for me to use - I've tried AGTH as well, and that doesn't generate any usable hooks either. These programs work with other games, increasing my confusion. I've tried this on my Macbook running Parallels Windows XP and on a desktop machine using the same operating system. Any help would be appreciated.

        Copy/pasted from my thread, also a similar problem seems to occur with Hitozuma Cosplay Kissa 2



        • Originally posted by levantino
          no, I mean you still need to add "/x" but I'm suggesting that ITH have that /x feature automatically (just like /c and some other engine)
          I don't know if I'm getting confused or just missing something here. How exactly do you add the /x code using ITH? Is it the same as the /H text hook insertion? Been screwing around with ITH, but I can't figure out how to do it.



          • * After linking between userhooks, none of the userhooks can hook text anymore...



            • Does anyone know if ITH works with any emulators, for example PCSX2? I would love to get a rough translation of the untranslated ending of Akai Ito (if I get it working with PCSX2 first XD).



              • New beta version here. Fixed link&thread issues.

                @foolmaker: LiveMaker support will be added to 2.2 release.

                @Tanukiking: The game may need an H-code. If you give me a download link to a working version, I may study that case.

                @Mishagal: ITH currently doesn't work with emulators. If you provide me with enough test cases (>=3), I may include support for them.
                Last edited by kaosu; 05-02-2011, 07:35 AM.


                • Do you accept requests for codes?

                  If so, I would really like one for 神採りアルケミーマイスター (link here)

                  Story text comes out fine in one thread.
                  But quest info, choices and stage info text only get captured once, across 2 threads.

                  Someone messed around with a memory editor or something and fixed it (messily though, according to his post).

                  Something about font caching.

                  Hoping for a code soon.
                  And yes, I already checked: there is no code in the AGTH thread.


                  • * Some problems/questions.

                    1. ITH no longer hooks any thread after I return to the first (title) page when playing Alicesoft's 大帝国.
                    I have to detach and attach ITH again to hook the game text.

                    2. When I play August's 穢翼のユースティア,
                    the game stops responding with a certain alert window, and the process has to be killed.
                    See the attached image of the alert message.

                    3. Request for Bruns engine support (used for the game モーニングスター). For now, just enter the hook option below in the command line
                    (I found that the newest ITH version can read this hook option from the command line.)

                    Code:
                    /HW4@0:msvcp90.dll:?push_back@?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@QAEXG@Z
                    Last edited by foolmaker; 04-29-2011, 06:26 AM.



                    • Older games come with msvcp80.dll, so it could also be
                      /HW4@0:msvcp80.dll:?push_back@?$basic_string@GU?$char_traits@G@std@@V?$allocator@G@2@@std@@QAEXG@Z
                      The forum forcibly inserts a space character after 50 characters, so it's not 'ch ar' but one continuous string. You can work around that with an empty tag, like this: $ch[b][/b]ar_traits@



                      • @foolmaker: 1. About 大帝国: I studied this issue and found that it is beyond the scope of the current ITH.
                        After you return to the title, the SYSTEM40 VM restarts itself, and all other DLLs are wiped from memory and reloaded,
                        so the hook address changes. The current solution is, as you said, to detach and attach ITH again so that ITH can determine the new hook address.

                        2. I haven't encountered this case. Could you post a save file here that triggers that error window?
                        I also need detailed information such as OS, directory path, actions before the error, etc.
                        BGI.exe is packed and may perform an integrity check. I'm not sure of that, but if so we will need a pretty aggressive (malicious) solution...

                        3. OK, I will study that. By the way, the code need not be that long if you encode it as 32-bit hashes (have a look at the hook window).
                        In any case, users will not need to enter this code after the 2.2 release.

                        @Freaka: Do you have any old games at hand? It would be a great help if you could give me some of them.
                        I will then study the differences in the engine and include the proper hook code.
                        Last edited by kaosu; 04-29-2011, 06:22 AM.


                        • Not really at hand, if you look at http://blgames.proboards.com/index.c...lay&thread=250 it's basically all games that have a msvcp80.dll hook. I'm not even sure if the usage of msvcp80.dll is game dependent, or might be system dependent.



                          • * 1. Request: removing blinking characters
                            When I played the sisters~夏の最後の日~ and several earlier titles from the company DELTA,
                            ITH hooked the text well, but it also hooked blinking special characters such as ▽, ◇, ▲, etc.
                            Because of the blinking characters, the text vanishes in an instant, so it is hard to read.
                            (Sorry for my poor description; see the attachment.)

                            P.S. AGTH may use some different technique for removing blinking special characters.
                            It can remove them...

                            * 2. FYI
                            Originally posted by kaosu
                            2. I didn't encountered this case. Could you post a savefile here which may trigger that error window?
                            Also I need detail information such as OS, directory path, actions before error, etc.
                            BGI.exe is packed and may perform integrity check. I'm not sure of that but if so we will need a pretty aggressive(malicious) solution then..
                            It happens randomly when I run the game through NTLEA.
                            I think this symptom will disappear if I use AppLocale.
                            So thanks for your interest, but you don't need to spend your time on it.
                            Thank you
                            Last edited by foolmaker; 04-30-2011, 10:07 PM.



                            • 2.2 is out. ITH is under GPLv3 from now on.
                              Added engine support: AbelSoftware, FrontWing(for old frontwing games), LiveMaker, Bruns
                              Soon I will release ITH64 1.0.

                              @foolmaker: It's difficult to determine which characters are needed and which are not.
                              So I will include a custom filter mechanism in the next version. Then you can specify which ones you want.
                              Last edited by kaosu; 05-02-2011, 06:59 AM.


                              • 1. Bug report

                                Originally posted by kaosu
                                2.2 is out. ITH is under GPLv3 from now on.
                                Added engine support: AbelSoftware, FrontWing(for old frontwing games), LiveMaker, Bruns
                                Soon I will release ITH64 1.0.
                                I just tested ITH v2.2 with a KiriKiri engine game, [110428] [TRYSET] サマー☆きゃんぷ.
                                Now ITH only hooks kirikiri1!!

                                But I guess there may be a small mistake (a delay bug) in the algorithm ITH uses to copy text to the clipboard.
                                It copies not the current sentence but the previous one (one line before) to the clipboard.

                                example)
                                【良樹】う、うん‥‥別に、いいけど‥‥。

                                【瞳】あっ‥‥ごめんね。わたし、別のお友だちにも挨拶しなくちゃ‥‥良樹くん、またあとでねー!

                                -> What is copied to the clipboard.
                                【良樹】う、うん‥‥別に、いいけど‥‥。

                                2.
                                Originally posted by kaosu
                                @foolmaker: It's difficult to determine which characters are needed and which are not.
                                So I will include a custom filter mechanism in the next version. Then you can specify which you want then.
                                I assume that AGTH does not omit special characters in general but perhaps excludes only the one blinking character.
                                But if that is difficult to determine, a custom filter will be greatly helpful, not only for removing certain characters
                                but also as a pre-filter for translation.
                                Last edited by foolmaker; 05-02-2011, 03:43 PM.
