No, it can't. Three-player mahjong is a completely different game.
pt refers to the same concept as Tenhou ranking pt. Simply put, it is a weighted version of the final placement at the end of the game.
Unfortunately, I don't know, because I can't find a legitimate way to test it.
Tenhou rejected my AI account request for Mortal because Mortal was developed by an individual rather than a company.
I (Equim) have no affiliation with them. I am not running any AI in ranked lobbies and will not do so until official permission is granted.
Check Mortal's documentation for details.
Technically, all visible information on the board is taken into account, including discard sequences with tedashi info, current points, round number and so on, but not information such as each player's level, historical stats, thinking time of each move, game lobby type, etc.
If you're referring to the deal-in rate column in akochan, Mortal does not have it; in fact, it was never explicitly calculated by Mortal in the first place. Mortal and akochan are two entirely different mahjong AI engines, created by different developers with different designs. So you probably shouldn't expect them to share any features.
For instance, if the game has pt setting
Player | Score | 1st place (%) | 2nd place (%) | 3rd place (%) | 4th place (%) |
---|---|---|---|---|---|
East | 29000 | 29.532 | 32.512 | 27.416 | 10.539 |
South | 14200 | 7.621 | 9.907 | 17.006 | 65.466 |
West | 27200 | 24.857 | 29.048 | 31.777 | 14.317 |
North | 29600 | 37.990 | 28.533 | 23.800 | 9.677 |
Note that these probabilities are estimates of the final rankings at the end of the whole game, not after the current kyoku.
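As a rough illustration of how final-placement probabilities like those in the table above can be weighted into a single pt-style number, here is a minimal sketch. The pt setting `ASSUMED_PT` and the helper `weighted_pt` are assumptions made up for this example; they are not taken from Mortal or from any particular lobby configuration.

```python
# Final-placement probabilities (%) copied from the table above,
# ordered as 1st, 2nd, 3rd, 4th place.
placement_pct = {
    "East": [29.532, 32.512, 27.416, 10.539],
    "South": [7.621, 9.907, 17.006, 65.466],
    "West": [24.857, 29.048, 31.777, 14.317],
    "North": [37.990, 28.533, 23.800, 9.677],
}

# Hypothetical pt setting for 1st..4th place (an assumption for illustration).
ASSUMED_PT = [90, 45, 0, -135]


def weighted_pt(pct_row, pt=ASSUMED_PT):
    """Weight each final-placement probability by the pt of that placement."""
    return sum(p / 100 * w for p, w in zip(pct_row, pt))


weighted = {seat: weighted_pt(row) for seat, row in placement_pct.items()}

# Print seats from the highest weighted value to the lowest.
for seat, value in sorted(weighted.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{seat:5s} {value:8.3f}")
```

With a different pt setting the ordering of the seats can change, which is why the placement distribution itself is shown rather than a single number.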
To get the
Wrapping up,
- $\hat Q^\pi(s, a)$ is not 局収支 (round EV).
- $\hat Q^\pi(s, a)$ is not pt.
- $\hat Q^\pi(s, a)$ is not 清算ポイント (end game score).
I have been considering whether to just remove the column or not, but in the end I decided to keep it as is. Just look up
A lower
- (❌) This move is worse.
- (✅) The AI is less interested in trying this move.
(Mortal) Why do all actions except the best sometimes have significantly lower Q values than that of the best?
As mentioned above,
This is an exploitation vs exploration dilemma. To begin with, Mortal is model-free, which means it cannot determine the actual Q value
ELI5: Mortal is optimized for playing, not reviewing or attribution.
Mortal is an end-to-end deep learning model that uses model-free reinforcement learning, so we are unlikely to be able to do any significant attribution work on it. If you insist on a reason for a decision made by Mortal, I would say that, in contrast to how humans play, Mortal does not rely on so-called "precise calculations", but rather on "intuition".
Not really. This is an intentional feature, and in the case shown in the figure, it is a rule-based fail-safe strategy against アガラス (win-to-be-last-place) in the all-last round.
The single-line output (starting with `Mortal:`) is the actual final decision made by the AI, while the expanded table provides additional, intermediate information that is totally optional and may be altered or even removed in a future version. When they are in conflict, the single-line output should take precedence. Furthermore, the table is just a by-product of the AI, and focusing too much on building it may hinder finding better ways to build a stronger AI.
Edit `jun_pt` in `tactics.json`. Note that there is a hard-coded bound of
Akochan is not good at kan. Akochan also has numerical stability issues in extreme situations.
Akochan pursues its sole goal very aggressively: the "final" pt EV, rather than just winning the current round.
where
Why square? For no special reason other than to please the human eye. The raw calculated value is usually very close to 1, and squaring it pulls the result further below 1, so a rating near 1 becomes harder to reach.
The calculation is essentially a basic min-max scaling and the result has a high variance. It is also directly tied to the output dynamic range of a specific engine (model). It shouldn't be considered a reliable measurement.
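To make the two remarks above concrete (min-max scaling, then squaring), here is a minimal sketch. The exact formula is not reproduced here, so the details are assumptions: `sketch_rating` is a hypothetical helper that scales the Q value of the player's chosen action between the lowest and highest Q value of each decision, averages the results, and squares the average.

```python
def min_max_scale(value, low, high):
    """Map value from [low, high] onto [0, 1]."""
    if high == low:
        # Every action has the same value, so any choice counts as a full match.
        return 1.0
    return (value - low) / (high - low)


def sketch_rating(decisions):
    """decisions: list of (q_values, chosen_index) pairs, one per decision.

    Assumed procedure: scale the chosen action's Q value per decision,
    average over all decisions, then square the average.
    """
    scaled = [
        min_max_scale(q_values[chosen], min(q_values), max(q_values))
        for q_values, chosen in decisions
    ]
    raw = sum(scaled) / len(scaled)
    return raw ** 2


# One decision where the chosen action sits at 95% of the Q value range.
example = [([0.0, 0.95, 1.0], 1)]
print(sketch_rating(example))
```

Because the scaling depends on the minimum and maximum Q value of each decision, the result follows the output range of whichever model produced the Q values, which is the dependence on dynamic range noted above.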