These are my personal study notes on the awesome project Handmade Hero.
If you think writing a professional-quality game from scratch on your own (no engine, no libraries) is interesting and challenging, I highly recommend this project.
In my opinion, it's the best one I can find.
- We can jump
- We can shoot
- We can go upstairs and downstairs
- We have a big world and many rooms
- We have a basic procedurally generated ground
Windows 10 with Visual Studio 2019 community version and Sublime Text 3.
Build system for Sublime Text 3:
{
    "build_systems":
    [
        {
            "name": "HandmadeHero",
            "shell_cmd": "build",
            "file_regex": "^(.+?)\\((\\d+)\\)(): (error)(.+)$"
        }
    ]
}
NOTE: This repo does not contain the copyrighted Handmade Hero assets. To build it, please consider preordering Handmade Hero.
- Create a `w` drive using subst: `subst w: /c/whatever_directory_you_choose`
- Clone this repo into the root of `w`
- Install Visual Studio 2019 community version
- cd into `w` and init cl: `.\handmade-hero\misc\shell.bat`
- Build and enjoy! `build`
My preferred code style for C is different from Casey's.
- snake_case for types, e.g. `game_world`
- camelCase for variables, e.g. `globalRunning`
- PascalCase for functions and macro functions, e.g. `GameUpdateVideo`
- UPPER_SNAKE_CASE for macro constants, e.g. `TILES_PER_CHUNK`
- Prefix a function name with an underscore to indicate that it should only be called through its corresponding macro, e.g. `_PushSize`
- `NOTE`: Something we need to pay attention to
- `PLAN`: Something we plan to do later
- `RESOURCE`: A valuable external resource
- `DIFF`: Something I have done differently from Casey
- `FUN`: Something interesting to know, like Windows can't correctly handle file formats they invented
- `CASEY`: Casey's opinion about programming
- Every memory allocation should go through a macro; it makes debugging much easier.
- Premultiplied Alpha: check day 83 for more details.
- Gamma Correction: check day 94 for more details.
- Transform Normal: check day 102 for more details.
- `dir /s [keyword]`: search files
- `findstr -s -n -i -l [keyword]`: find strings
- `WS_EX_TOPMOST`: make the window stay in front of others
- `WS_EX_LAYERED` and `SetLayeredWindowAttributes`: change window alpha
- `Spy++`: inspect windows and messages
- Fix the fullscreen problem caused by system-level display scaling
- Fix the long-running freeze bug: let the game run for a while and it will freeze
- Slow speed when moving across rooms
- Day 1: Setting Up the Windows Build
- Day 2: Opening a Win32 Window
- Day 3: Allocating a Back Buffer
- Day 4: Animating the Back Buffer
- Day 5: Windows Graphics Review
- Day 6: Gamepad and Keyboard Input
- Day 7: Initializing DirectSound
- Day 8: Writing a Square Wave to DirectSound
- Day 9: Variable-Pitch Sine Wave Output
- Day 10: QueryPerformanceCounter and RDTSC
- Day 11: The Basics of Platform API Design
- Day 12: Platform-Independent Sound Output
- Day 13: Platform-Independent User Input
- Day 14: Platform-Independent Game Memory
- Day 15: Platform-Independent Debug File
- Day 16: Visual Studio Compiler Switches
- Day 17: Unified Keyboard and Gamepad Input
- Day 18: Enforcing a Video Frame Rate
- Day 19: Improving Audio Synchronization
- Day 20: Debugging the Audio Sync
- Day 21: Loading Game Code Dynamically
- Day 22: Instantaneous Live Code Editing
- Day 23: Looped Live Code Editing
- Day 24: Win32 Platform Layer Cleanup
- Day 25: Finishing the Win32 Prototyping Layer
- Day 26: Introduction to Game Architecture
- Day 27: Exploration-based Architecture
- Day 28: Drawing a Tilemap
- Day 29: Basic Tilemap Collision Checking
- Day 30: Moving Between Tilemaps
- Day 31: Tilemap Coordinate Systems
- Day 32: Unified Position Representation
- Day 33: Virtualized Tilemaps
- Day 34: Tilemap Memory
- Day 35: Basic Sparse Tilemap Storage
- Day 36: Loading BMPs
- Day 37: Basic Bitmap Rendering
- Day 38: Basic Linear Bitmap Blending
- Day 39: Basic Bitmap Rendering Cleanup
- Day 40: Cursor Hiding and Fullscreen
- Day 41: Overview of the Types of Math Used in Games
- Day 42: Basic 2D Vectors
- Day 43: The Equations of Motion
- Day 44: Reflecting Vectors
- Day 45: Geometric vs. Temporal Movement Search
- Day 46: Basic Multiplayer Support
- Day 47: Vector Lengths
- Day 48: Line Segment Intersection Collision
- Day 49: Debugging Canonical Coordinates
- Day 50: Basic Minkowski-based Collision Detection
- Day 51: Separating Entities by Update Frequency
- Day 52: Entity Movement in Camera Space
- Day 53: Environment Elements as Entities
- Day 54: Removing the Dormant Entity Concept
- Day 55: Hash-based World Storage
- Day 56: Switching from Tiles to Entities
- Day 57: Spatially Partitioning Entities
- Day 58: Using the Spatial Partition
- Day 59: Adding a Basic Familiar Entity
- Day 60: Adding Hitpoints
- Day 61: Adding a Simple Attack
- Day 62: Basic Moving Projectiles
- Day 63 & 64 & 65 & 66: Major Refactoring with Simulation Region
- Day 67: Making Updates Conditional
- Day 68: Exact Enforcement of Maximum Movement Distances
- Day 69: Pairwise Collision Rules
- Day 70: Exploration To-do List
- Day 71: Converting to Full 3D Positioning
- Day 72: Proper 3D Inclusion Test
- Day 73: Temporarily Overlapping Entities
- Day 74: Moving Entities Up and Down Stairwells
- Day 75: Conditional Movements Based on Step Heights
- Day 76: Entity Heights and Collision Detection
- Day 77: Entity Ground Points
- Day 78: Multiple Collision Volumes Per Entity
- Day 79: Defining the Ground
- Day 80: Handling Traversables in the Collision Loop
- Day 81: Creating Ground with Overlapping Bitmaps
- Day 82: Caching Composited Bitmaps
- Day 83: Premultiplied Alpha
- Day 84: Scrolling Ground Buffer
- Day 85: Transient Ground Buffers
- Day 86: Aligning Ground Buffers to World Chunks
- Day 87: Seamless Ground Textures
- Day 88: Push Buffer Rendering
- Day 89: Renderer Push Buffer Entry Types
- Day 90: Bases Part 1
- Day 91: Bases Part 2
- Day 92: Filling Rotated and Scaled Rectangles
- Day 93: Textured Quadrilaterals
- Day 94: Converting sRGB to Light-linear Space
- Day 95: Gamma-correct Premultiplied Alpha
- Day 96: Introduction to Lighting
- Day 97: Adding Normal Maps to the Pipeline
- Day 98: Normal Map Code Cleanup
- Day 99: Test Environment Maps
- Day 100: Reflection Vectors
- Day 101: The Inverse and the Transpose
- Day 102: Transforming Normals Properly
- Day 103: Card-like Normal Map Reflections
- Day 104: Switching to Y-is-up Render Targets
- Day 105: Cleaning Up the Renderer API
- Day 106: World Scaling
- Day 107: Fading Z Layers
- Day 108: Perspective Projection
- Day 109: Resolution-Independent Rendering
- Day 110: Unprojecting Screen Boundaries
- Day 111: Resolution-independent Ground Chunks
- Day 112: A Mental Model of CPU Performance
- Day 113: Simple Performance Counters
- Day 114: Preparing a Function for Optimization
- Day 115: SIMD Basics
- Day 116: Converting Math Operations to SIMD
- Day 117: Packing Pixels for the Framebuffer
- Day 118: Wide Unpacking
- Day 119: Counting Intrinsics
- Day 120: Measuring Port Usage with IACA
- Day 121: Rendering in Tiles
- Day 122: Introduction to Multithreading
- Day 123: Interlocked Operations
- Day 124: Memory Barriers and Semaphores
- Day 125 & 126: Work Queue
- Day 127: Aligning Rendering Memory
- Install Visual Studio 2019
- Call `vsdevcmd` to init command line tools
- Use `cl` to build our program
- Use `devenv` to start Visual Studio to debug, e.g. `devenv w:\build\win32_handmade.exe`
- `WinMain`: Entry point of a Windows program
- `MessageBox`: Show a message box
- `WNDCLASS`, `RegisterClass`
- `GetModuleHandle`
- `OutputDebugString`
- `DefWindowProc`
- `CreateWindow`, `CreateWindowEx`
- `GetMessage`, `TranslateMessage`, `DispatchMessage`
- `BeginPaint`, `EndPaint`, `PatBlt`
- `PostQuitMessage`
- #define `global_variable` and `internal` to `static`
- Resize the buffer when we receive `WM_SIZE`
- `GetClientRect`
- `CreateDIBSection`
- `StretchDIBits`
- `DeleteObject`
- `CreateCompatibleDC`
- `ReleaseDC`
- Use `VirtualAlloc` to allocate bitmap memory instead of `CreateDIBSection`
- `VirtualFree`, `VirtualProtect`
- Set `biHeight` to a negative value so that the image origin is top-left
- Render a simple gradient. Each pixel has a value of the form `0xXXRRGGBB`
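As a minimal sketch of the gradient step, assuming a 32-bit top-down buffer whose rows are `pitch` bytes apart: the function name `RenderGradient` and its parameters are illustrative, not necessarily what this repo uses.

```c
#include <stdint.h>

/* Fill a 32-bit back buffer with a blue/green gradient.
   pitch is the size of one row in bytes (an assumption of this sketch). */
static void RenderGradient(void *memory, int width, int height, int pitch,
                           int xOffset, int yOffset)
{
    uint8_t *row = (uint8_t *)memory;
    for (int y = 0; y < height; ++y) {
        uint32_t *pixel = (uint32_t *)row;
        for (int x = 0; x < width; ++x) {
            uint8_t blue = (uint8_t)(x + xOffset);
            uint8_t green = (uint8_t)(y + yOffset);
            /* 0xXXRRGGBB: red stays 0 and the top padding byte is unused */
            *pixel++ = ((uint32_t)green << 8) | blue;
        }
        row += pitch;
    }
}
```

Changing `xOffset` and `yOffset` every frame makes the gradient scroll, which is an easy way to confirm that the buffer is really being redrawn.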
- Use `PeekMessage` instead of `GetMessage`, because it doesn't block
- `GetDC`, `ReleaseDC`
- `HREDRAW` and `VREDRAW` are used to tell Windows to redraw the whole window
- Use `win32_offscreen_buffer` to bundle all global variables
- Create the back buffer just once, move it out of `WM_SIZE`
- `XInput`, `XInputGetState`, `XInputSetState`, `XUSER_MAX_COUNT`
- Loading Windows functions ourselves
- Use XInput 1.3
- `LoadLibrary`, `GetProcAddress`
- `WM_SYSKEYUP`, `WM_SYSKEYDOWN`, `WM_KEYUP`, `WM_KEYDOWN`
- Get IsDown and WasDown status from LParam
- Return `ERROR_DEVICE_NOT_CONNECTED` in xinput stub functions
- Implement `Alt+F4` to close the window
- Use bool32 if we only care whether the value is 0 or not 0
- `dsound.h`, IDirectSound8 Interface
- `DirectSoundCreate`, `SetCooperativeLevel`, `CreateSoundBuffer`, `SetFormat`
- Remember to clear `DSBUFFERDESC` to zero
- Add `MEM_RESERVE` to `VirtualAlloc`
- IDirectSoundBuffer8 Interface
- `Lock`, `Unlock`, `GetCurrentPosition`, `Play`
- `sinf`
- `win32_sound_output`, `Win32FillSoundBuffer`
- `tSine`, `LatencySampleCount`
- We need to handle xinput deadzone in the future
- Use `DefWindowProcA` instead of `DefWindowProc`
- `QueryPerformanceCounter`, `LARGE_INTEGER`, `QueryPerformanceFrequency`
- `wsprintf`, `__rdtsc`
- Intrinsic: looks like a function call, but it tells the compiler we want a specific assembly instruction here
- Win32 platform todo list:
- Saved game location
- Getting a handle to our executable
- Asset loading
- Threading
- Raw input (support for multiple keyboards)
- Sleep/timeBeginPeriod
- ClipCursor() (for multimonitor)
- Fullscreen
- WM_SETCURSOR (control cursor visibility)
- QueryCancelAutoplay
- WM_ACTIVATEAPP (for when we are not the active application)
- Blit speed improvements
- Hardware acceleration
- GetKeyboardLayout (for French keyboards)
- For each platform, we will have a big [platform]_handmade.cpp file. Inside this file, we #include other files.
- Treat our game as a service, rather than the operating system.
- `_alloca`: Allocate some memory on the stack. It is freed when the function exits, not when execution leaves the enclosing scope
- Move sound rendering logic to handmade.cpp
- Define `game_input`, `game_controller_input`, `game_button_state`
- Store OldInput and NewInput and ping-pong them at the end of every frame
- Define `ArrayCount` macro
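The ping-pong of OldInput and NewInput can be sketched as a pointer swap. The struct fields and the helper name `SwapInput` below are assumptions made for illustration only.

```c
/* Hypothetical, trimmed-down versions of the input structs. */
typedef struct game_button_state {
    int halfTransitionCount;
    int endedDown;
} game_button_state;

typedef struct game_input {
    game_button_state buttons[4];
} game_input;

/* Swap the two input pointers at the end of a frame, so this frame's
   "new" input becomes next frame's "old" input without copying. */
static void SwapInput(game_input **newInput, game_input **oldInput)
{
    game_input *temp = *newInput;
    *newInput = *oldInput;
    *oldInput = temp;
}
```

Keeping two buffers lets the platform layer compare the previous button state with the current one to count transitions.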
- Use a `game_memory` struct to handle all memory-related stuff
- We have permanent storage and transient storage in our memory
- Define `Kilobytes`, `Megabytes` and `GigaBytes` macros
- We require the allocated memory to be cleared to zero
- Define `Assert` macro
- Use `cl -Dname=val` to define the `HANDMADE_INTERNAL` and `HANDMADE_SLOW` compiler flags
- Specify a base address when we call `VirtualAlloc`, for debugging purposes in the internal build
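A rough sketch of how these macros and the `game_memory` struct might look. The macro and struct names come from the notes above; the bodies and the field names are my assumptions.

```c
#include <stdint.h>
#include <stddef.h>

/* Size helpers; LL keeps the math in 64 bits so 4 GB does not overflow. */
#define Kilobytes(value) ((value) * 1024LL)
#define Megabytes(value) (Kilobytes(value) * 1024LL)
#define GigaBytes(value) (Megabytes(value) * 1024LL)

#define ArrayCount(array) (sizeof(array) / sizeof((array)[0]))

/* Assert only fires in slow builds (e.g. cl -DHANDMADE_SLOW=1):
   writing through a null pointer stops the debugger on the spot. */
#if HANDMADE_SLOW
#define Assert(expression) if (!(expression)) { *(volatile int *)0 = 0; }
#else
#define Assert(expression)
#endif

/* Field names are hypothetical. */
typedef struct game_memory {
    int isInitialized;
    uint64_t permanentStorageSize;
    void *permanentStorage;  /* must be cleared to zero by the platform */
    uint64_t transientStorageSize;
    void *transientStorage;  /* must be cleared to zero by the platform */
} game_memory;
```

With this in place the platform layer can write something like `memory.permanentStorageSize = Megabytes(64);` and make one big allocation at startup.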
- Define `DebugPlatformReadFile`, `DebugPlatformWriteFile` and `DebugPlatformFreeFileMemory` only when we are using the internal build
- Define `SafeTruncateUInt64` inline functions
- `CreateFile`, `GetFileSizeEx`, `ReadFile`
- `__FILE__` is a compile-time macro that expands to the current file name
- VS compiler switches:
  - `-WX`, `-W4`: enable warning level 4 and treat warnings as errors
  - `-wd`: turn off some warnings
  - `-MT`: statically link the C runtime library
  - `-Oi`: generate intrinsic functions
  - `-Od`: disable optimization
  - `-GR-`: disable run-time type information, we don't need this
  - `-Gm-`: disable minimal rebuild
  - `-EHa-`: disable exception handling
  - `-nologo`: don't print compiler info
  - `-FC`: full path of source code file in diagnostics
- Init `vsdevcmd` using the `-arch=x86` flag to build a 32-bit version of our program
- Use `/link` to pass linker options to make a valid Windows XP executable: `-subsystem:windows,5.1`
- Add one controller, so we have 5 controllers now
- Extract `CommonCompilerFlags` and `CommonLinkerFlags` in build.bat
- Copy old keyboard button state to new keyboard button state
- Add MoveUp, MoveDown, MoveLeft, MoveRight buttons
- Handle XInput dead zone
- Check whether union in game_controller_input is aligned
- We need to find a way to reliably retrieve the monitor refresh rate
- We define `GameUpdateHz` based on `MonitorRefreshHz`
- Use `Sleep` to wait for the remaining time
- Use `timeBeginPeriod` to modify scheduler granularity
- Record last play cursor and write cursor
- Define `Win32DebugSyncPlay` to draw it
- Use a while loop to test direct sound audio update frequency
The audio sync logic is indeed very hard and complicated.
I didn't take many notes because I was really confused and I didn't understand much.
- Compute audio latency seconds using write cursor - play cursor
- Define `GameGetSoundSamples`
- Compile win32_handmade and handmade separately
- Define win32_game_code and
- Put platform debug functions into game memory
- Enable the `/LD` switch to build a dll
- Use /EXPORT linker flags to export dll functions
- We don't need to define a `DllMain` entry point in our dll
- extern "C" to prevent name mangling
- Turn off incremental linking
- Use `CopyFile` to copy the dll
NOTE: `CopyFile` may fail the first time, so we use a while loop to retry it. This is debug code, so we don't care about the performance.
- Use /PDB:name linker options to specify pdb file name
- Add timestamp to output pdb file name
- Delete PDB files and pipe del output to NUL
- Use `FindFirstFile` to get file write time
- Use `CompareFileTime` to compare file times
- Use `GetModuleFileName` to get the exe path and use it to build the full dll path
- We can use the MAX_PATH macro to define the length of the path buffer
- Define `win32_state` to store InputRecordIndex and InputPlayingIndex; we only support one slot now
- Press L to toggle input recording
- Store input and memory into files
- Use a simple jump to test our looped editing
- We can use `WS_EX_TOPMOST` and `WS_EX_LAYERED` to make our window topmost and partially transparent
- We can do it in the `WM_ACTIVATEAPP` message, so that the game becomes transparent when it loses focus
- Fix the audio bug (I already fixed that on a previous day)
- Change blit mode to 1-to-1 pixels
- Use `%random%` for pdb files
- Change compiler flag `MT` to `MTd`
- Store EXE directory in Win32State and put record input file to build dir
- Use `GetFileAttributesEx` instead of `FindFirstFile` to get the last write time of a file
- Use `GetDeviceCaps` to get monitor refresh rate
- Pass `thread_context` from platform to game and from game to platform
- Add mouse info to game_input, using `GetCursorPos`, `ScreenToClient`
- Record mouse buttons using `GetKeyState`
- Define win32_replay_buffer and store game state memory in memory using `CopyMemory` (storing to disk is actually very fast on my computer, but I am gonna do it anyway)
- I am not gonna do memory mapping, because I think it's unnecessary
Today there isn't any code to write. I am just listening to Casey talking about what a good game architecture looks like.
In Casey's view, a game architect is like an urban planner: the job is to organize things roughly rather than to plan everything carefully. I couldn't agree more.
- Add `SecondsToAdvanceOverUpdate` to game_input
- Remove debug code
- Turn off warning C4505, it's annoying. We are gonna have unreferenced local functions.
- Target resolution: 960 x 540 x 30hz
- Define `DrawRectangle`
- Use floating point to store colors, because it makes things a lot easier when we have to do math on colors
- Draw a simple tilemap
- Draw a simple player. Keep in mind that the player's movement should take the time delta into account; otherwise it will move faster when we run at a higher FPS.
- Use `PatBlt` to clear the screen when displaying our buffer
- We should only clear the four gutters, otherwise there will be some flashing
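The point about the time delta can be sketched as follows. `MovePlayer` and its parameter names are hypothetical; the only idea taken from the notes is that speed is expressed per second and scaled by the frame time.

```c
/* Advance a position by a speed given in units per second, scaled by the
   frame's time delta, so the distance covered per second does not depend
   on the frame rate. */
static float MovePlayer(float position, float speedPerSecond,
                        float secondsToAdvance)
{
    return position + speedPerSecond * secondsToAdvance;
}
```

Running 30 updates with a delta of 1/30 s or 60 updates with a delta of 1/60 s moves the player by (almost exactly) the same distance, up to floating-point rounding.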
- Implement a simple collision check
- Separate the header file into handmade.h and handmade_platform.h
- Define four tilemaps, and notice that in C a two-dimensional array is indexed Y first and X last
- Define `canonical_position` and `raw_position`
- Implement moving between tilemaps
- NOTE: Basically any CPU we are gonna target at has SSE2
- Define `handmade_intrinsic.h`
- Define `TileSizeInMeters` and `TileSizeInPixels`
- Optimization switches: `/O2 /Oi /fp:fast`
- PLAN: Pack tilemap index and tile index into a single 32-bit integer
- PLAN: Convert TileRelX and TileRelY to resolution independent world units
- RESOURCE: Intel Intrinsics Guide, https://software.intel.com/sites/landingpage/IntrinsicsGuide/
- Remove `raw_position`
- Add `canonical_position PlayerPos` to game state
- Define `RecononicalizePosition`
- Use meters instead of pixels as units
- Rename `canonical_position` to `world_position`
- Make Y axis go upward
- RESOURCE: a great book about topology: Galois' Dream: Group Theory and Differential Equations
- Remove TileMapX and TileMapY
- Define `tile_map_position`
- 24-bit for tilemap and 8-bit for tiles
- Implement a simple scroll so the guy can move
- Implement smooth scrolling
- Implement a way to speed the guy up
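The "24-bit for tilemap and 8-bit for tiles" split can be sketched with shifts and masks. The helper names and the `CHUNK_SHIFT`/`CHUNK_MASK` constants are made up for this example.

```c
#include <stdint.h>

/* Low 8 bits: tile index inside a chunk. High 24 bits: chunk index. */
#define CHUNK_SHIFT 8
#define CHUNK_MASK ((1u << CHUNK_SHIFT) - 1)

static uint32_t PackTileIndex(uint32_t chunkIndex, uint32_t tileInChunk)
{
    return (chunkIndex << CHUNK_SHIFT) | (tileInChunk & CHUNK_MASK);
}

static uint32_t GetChunkIndex(uint32_t absTile)
{
    return absTile >> CHUNK_SHIFT;
}

static uint32_t GetTileInChunk(uint32_t absTile)
{
    return absTile & CHUNK_MASK;
}
```

A nice property of this packing is that stepping past the last tile of a chunk carries into the chunk index automatically: `PackTileIndex(3, 255) + 1` equals `PackTileIndex(4, 0)`.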
- Make `TileRelX` and `TileRelY` relative to center of tile
- Create `handmade_tile.h` and `handmade_tile.cpp`
- Rename `tile_map` to `tile_chunk` and extract everything from `world` to `tile_map`; now tilemap means the whole map
- Define `memory_arena`, `PushSize` and `PushArray`, and create tile chunks programmatically
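A minimal sketch of a `memory_arena` as a bump allocator. The names `memory_arena`, `PushSize`, `PushArray` and `_PushSize` come from these notes; the macro signatures, `InitializeArena`, and the NULL-on-overflow behavior are assumptions. The sketch also ignores alignment.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct memory_arena {
    size_t size;    /* total bytes available */
    uint8_t *base;  /* start of the backing memory */
    size_t used;    /* bytes handed out so far */
} memory_arena;

static void InitializeArena(memory_arena *arena, size_t size, void *base)
{
    arena->size = size;
    arena->base = (uint8_t *)base;
    arena->used = 0;
}

/* The macros supply sizeof and the cast; _PushSize is never called directly. */
#define PushSize(arena, type) ((type *)_PushSize(arena, sizeof(type)))
#define PushArray(arena, count, type) \
    ((type *)_PushSize(arena, (count) * sizeof(type)))

/* Hand out the next `size` bytes, or NULL when the arena is full. */
static void *_PushSize(memory_arena *arena, size_t size)
{
    if (size > arena->size - arena->used) {
        return NULL;
    }
    void *result = arena->base + arena->used;
    arena->used += size;
    return result;
}
```

Nothing is ever freed individually: the whole arena lives in the game's permanent storage, which is why routing every allocation through one macro makes debugging easier.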
- Make tile size small so we can see more chunks
- Remove `TileSizeInPixels` and `MetersToPixels` from `tile_map`
- Use random.org to generate some random numbers and use them to generate screen randomly
- Generate doors based on our choice
- Allocate space for tiles only when we access
- Add Z index to tilemap
- FUN: Windows can't render BMPs correctly. This is very amusing, because they are the guys who invented BMP.
- Make player go up and down. I already implemented this function in the previous day, but I need to reimplement it in a new way: when the player moves to the stair, it goes automatically, no need to push any button.
- Rename `TileRelX` and `TileRelY` to `OffsetX` and `OffsetY`
- Define `bitmap_header` and parse the bitmap. We have to use `#pragma pack(push, 1)` and `#pragma pack(pop)` to make MSVC pack our struct correctly
- Design a very specific BMP to help debug our rendering. This is a very clever method.
- I found that my structured_art.bmp has a different byte order from Casey's. It turns out that BMP has something called RedMask, GreenMask, BlueMask and AlphaMask.
- BMP byte order: should be determined by the masks
- Render background bmp
- Define `loaded_bitmap` to pack all things up
- Define `DrawBitmap`
- Define `FindLeastSignificantSetBit` and `bit_scan_result` in intrinsics
- Define `COMPILER_MSVC` and `COMPILER_LLVM` macro variables
- Use the `_BitScanForward` MSVC compiler intrinsic when we are using Windows
- Implement a simple linear alpha blending
- Assert compression mode when loading BMP
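To illustrate how the BMP channel masks and `FindLeastSignificantSetBit` fit together, here is a portable fallback that loops instead of using `_BitScanForward`. The struct fields and the `ExtractChannel` helper are assumptions made for this sketch.

```c
#include <stdint.h>

typedef struct bit_scan_result {
    int found;
    uint32_t index;
} bit_scan_result;

/* Portable fallback: index of the lowest set bit, if any. */
static bit_scan_result FindLeastSignificantSetBit(uint32_t value)
{
    bit_scan_result result = {0, 0};
    for (uint32_t test = 0; test < 32; ++test) {
        if (value & (1u << test)) {
            result.index = test;
            result.found = 1;
            break;
        }
    }
    return result;
}

/* Pull one channel out of a pixel, given that channel's mask from the
   BMP header. The shift amount is the position of the mask's lowest bit. */
static uint32_t ExtractChannel(uint32_t pixel, uint32_t mask)
{
    bit_scan_result shift = FindLeastSignificantSetBit(mask);
    return shift.found ? ((pixel & mask) >> shift.index) : 0;
}
```

Reading channels through the masks is what makes the loader independent of the byte order a particular image editor chose.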
- Load hero bitmaps for four directions
- Change hero direction when moves
- Align hero bitmaps with real position
- Replace camera scrolling with fixed camera
- Move camera when player moves
- Fix clipping problem in our bitmap drawing
- Check frame rate
- Fix msvc pdb problem when hot reloading by creating a lock file
- Write a `static_check` bat file to make sure we never type `static`
- Set a default cursor style using `LoadCursor`
- Hide the cursor by responding to the `WM_SETCURSOR` message with `SetCursor(0)` in production build
- RESOURCE: How do I switch a window between normal and fullscreen? https://devblogs.microsoft.com/oldnewthing/20100412-00/?p=14353
- Implement full screen toggling
- Do fullscreen rendering in fullscreen mode
Math we are gonna need:
- Arithmetic
- Algebra
- Euclidean Geometry
- Trigonometry
- Calculus
- Linear Algebra
- Partial Differential Equation
- Ordinary Differential Equation
- Complex Numbers
- Non-Euclidean Geometry
- Topology
- Minkowski Algebra
- Control Theory
- Interval Arithmetic
- Graph Theory
- Operations Research
- Probability and Statistics
- Cryptography / Pseudorandom Number Generator
- Fix diagonal movement problem
- Define `v2` and implement the add, minus and unary minus operators in `handmade_math.h`
- Use v2 instead of x and y
- Add `dPlayerP` to game state. This is the speed of the guy.
- Add a back force based on player's speed
- Implement inner product for vectors
- Reflect the speed when the player hits a wall (or make the speed align with the wall). This can be implemented with a clever bit of vector math: `v' = v - 2 * Inner(v, r) * r`, where r is the unit vector of the reflecting direction. For the bottom wall, r is `(0, 1)`.
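The reflection formula above, written out as a small sketch. `Reflect` and `Slide` are names I made up; `Inner` and `v2` follow the notes. `Slide` uses a factor of 1 instead of 2, which removes the component going into the wall so the speed ends up aligned with it.

```c
typedef struct v2 {
    float x, y;
} v2;

static float Inner(v2 a, v2 b)
{
    return a.x * b.x + a.y * b.y;
}

/* v' = v - 2 * Inner(v, r) * r, where r is the wall's unit normal. */
static v2 Reflect(v2 v, v2 r)
{
    float scale = 2.0f * Inner(v, r);
    v2 result = { v.x - scale * r.x, v.y - scale * r.y };
    return result;
}

/* v' = v - 1 * Inner(v, r) * r: keep only the part parallel to the wall. */
static v2 Slide(v2 v, v2 r)
{
    float scale = Inner(v, r);
    v2 result = { v.x - scale * r.x, v.y - scale * r.y };
    return result;
}
```

For a player moving at `(3, -4)` into the bottom wall with `r = (0, 1)`, `Reflect` gives `(3, 4)` and `Slide` gives `(3, 0)`.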
- CASEY: Searching in p (position) is way better than searching in t (time)
- Part of new collision detection algorithm
- There is no code today. I will write the new collision detection algorithm when it's complete.
- We have a severe bug! Player has been moved multiple times!
- Define `entity` struct. Add `Entities`, `EntityCount`, `PlayerIndexForController` and `CameraFollowingEntityIndex` to game state
- Support as many players as our controllers in game state
- Implement `RotateLeft` and `RotateRight` intrinsics using `_rotl` and `_rotr`
- Define `Length`, `SquareRoot` to fix diagonal movement problem
- We will use search in t instead of search in p, because to implement the latter we would have to build the whole search space. It's complex and doesn't pay off.
- Part of new collision detection algorithm
- Implement the new collision detection algorithm
- Add a `tEpsilon` to tolerate floating-point error
- Add an `Offset` method to manipulate tile map position and recanonicalize automatically
- Maybe we shouldn't make the world toroidal, since it adds much complexity
- Introduction of Minkowski sum and GJK algorithm
- Implement area collision detection
- Take player area into account when calculating MinTileX, MaxTileX, MinTileY and MaxTileY
- Modify speed when player hits the wall
- Use a loop to move player
- Divide entities into high, low and dormant categories
- Define `entity_residence` enum
- Casey did part of the new implementation
- No code today. I will wait till the new implementation is finished
- Make player move again
- Map float position to tile map position after moving player
- Make camera move again
- Define `SetCamera` and move entities into/out of the high set
- Define `entity_type` and add wall entities
- Remove `tRemaining` in collision detection
- Remove dormant entity and entity residence concept
- Define `MakeEntityHighFrequency` and `MakeEntityLowFrequency`
- Make code work again
- DIFF: I don't like all the index thing. I will use pointers instead.
- Use int32 as chunk index, so 0 will be the center.
- Add `TileChunkHash` to tile map
- `GetTileChunk` should take a memory arena
- DIFF: I will store pointers instead of indices in the TileChunkHash array
- Rename `CameraBound` to `HighFrequencyBound`
- Rename `handmade_tile.h/cpp` to `handmade_world.h/cpp`
- Rename tile chunk to world chunk. We don't have tiles anymore, just chunks.
- Define `entity_block` and `ChangeEntityLocation`
- Implement `WorldPositionFromTilePosition`
- Reimplement `SetCamera` using spatial partition
- Call `ChangeEntityLocation` when adding low entities
- Load tree bitmap and render it as wall
- Add monster and familiar entity type
- Define `entity_render_piece` and `entity_render_piece_group`
- Implement `UpdateFamiliar`
- Define `hit_point` struct
- Draw hit points
- Define `v3` and `v4` vectors
- CASEY: Always write the usage code first. It gives you the necessary context for writing the real thing.
- Add `EntityType_Sword` entity type
- Define `DrawHitPoints()` and `InitHitPoints()` and add hitpoints for our monster
- Load rock03.bmp as sword and render it when some key is pressed
- Define `NullPosition()` and `IsPositionValid()`. Use some specific value to represent a null position.
- Define `move_spec` and pass it to `MoveEntity`
- Add `distanceRemaining` to sword
- Define `UpdateSword` and make the sword disappear when the remaining distance reaches zero
This is a big change, but it's definitely worth it.
- Remove `low_entity` and `high_entity`. They were never a good idea.
- Define `sim_entity` and `stored_entity`. `stored_entity` is for storage and `sim_entity` is for simulation.
- Every frame, pull relevant entities into our simulation region, simulate them and render them.
- Lots of modifications to adjust to this new model
- Add `updatable` to sim entity and set it correspondingly
- Add `updatableBounds` to sim region. The previous bounds become the total bounds.
- `LoadEntityReference` should get position from reference entity
- `UpdateSword` doesn't have to check NonSpatial flag
- Move update logic back to our main function
- CASEY: Avoid callbacks; plain switch statements are just better in every aspect.
- Consider `distanceLimit` in moveEntity function
- CASEY: Fight the double dispatch problem with a property system.
- Define a simple `HandleCollision` function to make the sword hurt the monster when they collide
- Remove `EntityFlag_Collides`
- Define `ShouldCollide` to check whether two entities should collide
- Define `pairwise_collision_rule`
- Add `collisionRuleHash` and `firstFreeCollisionRule` to game state
- Define `AddCollisionRule` and `ClearCollisionRulesFor`
- One way to fix the `ClearCollisionRulesFor` function: every time, we insert two entries so that we can query with either one.
To-do list:
- Multiple sim regions per frame
  - Per-entity clocking
  - Sim-region merging? For multiple players?
- Z!
  - Clean up things by using v3
  - Figure out how you go "up" and "down", and how is this rendered?
- Collision detection?
  - Entry/exit?
  - What's the plan for robustness? / shape definition?
- Debug code
  - Logging
  - Diagramming
  - Switches / slides / etc.
- Audio
  - Sound effect triggers
  - Ambient sounds
  - Music
- Asset streaming
- Metagame / save game?
  - Do we allow saved games? Probably yes, just only for "pausing".
  - Continuous save for crash recovery?
- Rudimentary world generation
  - Placement of background things
  - Connectivity?
  - Non-overlapping
  - Map display
  - Magnets - how they work???
- AI
  - Rudimentary monster behavior example
  - Path finding
  - AI "storage"
- Animation system
  - Skeletal animation
  - Particle system
- Rendering
- GAME
  - World generation
  - Entity system
- Remove `world_diff`
- Remove `chunkSizeInMeters` and add `chunkDimInMeters`
- Define `Hadamard` for v2 and v3
- Make `p` and `dP` v3 in sim entity
- Define `rectangle3`
- Implement the simple jump (Casey implemented this a long time ago)
- `AABB`: Axis-aligned bounding box
- Add `maxEntityRadius`, `maxEntityVelocity` to sim region
- Change `width` and `height` in sim entity to `dim`
- Define `EntityOverlapsRectangle` and use it to test whether an entity is inside a rectangle
- Add `EntityType_Stairwell` and use rock_02 bmp as our stairwell asset
- Implement `AddStair` and draw our stair
- Define `overlappingCount`, `overlappingEntites` and `RectanglesIntersect`
- Pass `wasOverlapping` to `HandleCollision`
- Move `AddCollisionRule` to handle collision
- Remove overlapping stuff and define `CanOverlap` and `HandleOverlap`
- Rename `ShouldCollide` to `CanCollide`
- Call `HandleOverlap` at the end of `MoveEntity`
- Draw our stairwell as a rectangle
- Define `GetBarycentric`
- Define `SafeRatioN`, `SafeRatio0` and `SafeRatio1`
- Define `Lerp`
- Add `EntityFlag_Moveable`
- Rename `AddFlag` -> `AddFlags`, `ClearFlag` -> `ClearFlags`
- Define `Clamp` and `Clamp01`
- Fix `BeginSim` to loop over chunkZ
- Modify stairwell z so that the minimum z of its volume is 0
- Add `EntityFlag_ZSupported`
- Prevent player from "jumping" when he goes up/down stairs
- Define `SpeculativeCollide` to prevent the hero from stepping out of the stair and jumping into the stair
- Add `zFudge` when rendering
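The math helpers named in this list (`Lerp`, `Clamp`, `Clamp01`, the `SafeRatio` family) are small enough to sketch in scalar form. The argument orders are my guess, and `GetBarycentric1D` is a hypothetical one-axis simplification of `GetBarycentric`.

```c
/* Blend from a (t = 0) to b (t = 1). */
static float Lerp(float a, float t, float b)
{
    return (1.0f - t) * a + t * b;
}

static float Clamp(float min, float value, float max)
{
    if (value < min) return min;
    if (value > max) return max;
    return value;
}

static float Clamp01(float value)
{
    return Clamp(0.0f, value, 1.0f);
}

/* Divide, but fall back to n when the divisor is zero. */
static float SafeRatioN(float numerator, float divisor, float n)
{
    return (divisor != 0.0f) ? (numerator / divisor) : n;
}

static float SafeRatio0(float numerator, float divisor)
{
    return SafeRatioN(numerator, divisor, 0.0f);
}

static float SafeRatio1(float numerator, float divisor)
{
    return SafeRatioN(numerator, divisor, 1.0f);
}

/* Where p sits between min and max on one axis: 0 at min, 1 at max. */
static float GetBarycentric1D(float min, float max, float p)
{
    return SafeRatio0(p - min, max - min);
}
```

One plausible use on a stairwell: take the barycentric coordinate of the player inside the stair volume, clamp it with `Clamp01`, and `Lerp` between the bottom and top heights to get the ground height.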
- Take `z` into account in `MoveEntity`. Remember to set height for walls.
- Change `TileDepthInMeters`; currently it's just the same as `TileSizeInMeters`.
- Modify `RectanglesIntersect`
- Define `AddGroundedEntity`
- TODO: need to fix the rendering!
- Fix ground handling, need to take the z dimension into account
- Fix the drawing code
- Define `GetEntityGroundPoint` and fix `SpeculativeCollide`
- Add `walkableHeight` to entity, which is used only for stairwells, and modify `SpeculativeCollide`
- Define `GetStairwellGround` and fix `HandleOverlap`. It should use the same method to calculate the stairwell ground as `SpeculativeCollide`.
- The position point doesn't necessarily have to be the collision point
- Define `sim_entity_collision_volume` and `sim_entity_collision_volume_group`
- Remove `dim` in entity and add `collision`
- Define `walkableDim` for stairwell
- Initialize collision groups when initializing memory
- Always initialize collision to null collision
- Set z drag to 0
- Casey talks about difference between "filled and carve" (Quake way) vs "empty and fill" (Unreal way) model
- CASEY: Robustness > efficiency!
- Introduce the concept of "room"
- Define `AddStandardRoom`
- Define `PushRectOutline` to draw the room
- Define `test_wall` and make wall testing data driven
- Inline `TestWall` function
- Test overlap using all volumes and extract code into `EntitiesOverlap`
- Add `epsilon` to `EntitiesOverlap`
- Add test for `tMax`, mostly the same as `tMin`
- Test our new code that prevents hero from ever getting outside
- Load grass, ground and tuft bitmaps
- Define `DrawTest` and randomly draw some grasses, grounds and tufts
- Casey talks about megatexture
- Make random numbers more systematic
  - define `random_series`
  - define `Seed`, `RandomChoice`, `RandomUnilateral`, `RandomBilateral`, `RandomBetween`
  - replace old random code with the new functions above
- Make `loaded_bitmap` have the same structure as `game_offscreen_buffer`; all drawing functions that previously took game_offscreen_buffer now take loaded_bitmap
- Define `MakeEmptyBitmap`; remember to clear the data to zero!
- Draw ground bitmap once and cache it in game state
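A sketch of the `random_series` API listed above. The function names come from the notes, but the generator itself is a swapped-in technique: I use a 32-bit xorshift here purely for illustration, and `NextRandomUInt32` is a hypothetical helper.

```c
#include <stdint.h>

typedef struct random_series {
    uint32_t state;
} random_series;

static random_series Seed(uint32_t value)
{
    /* xorshift must not start from zero, so substitute 1 in that case */
    random_series series = { value ? value : 1u };
    return series;
}

/* 32-bit xorshift step (an assumption of this sketch). */
static uint32_t NextRandomUInt32(random_series *series)
{
    uint32_t x = series->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    series->state = x;
    return x;
}

/* Integer in [0, choiceCount). */
static uint32_t RandomChoice(random_series *series, uint32_t choiceCount)
{
    return NextRandomUInt32(series) % choiceCount;
}

/* Float in [0, 1]. */
static float RandomUnilateral(random_series *series)
{
    return (float)NextRandomUInt32(series) / (float)UINT32_MAX;
}

/* Float in [-1, 1]. */
static float RandomBilateral(random_series *series)
{
    return 2.0f * RandomUnilateral(series) - 1.0f;
}

/* Float in [min, max]. */
static float RandomBetween(random_series *series, float min, float max)
{
    return min + RandomUnilateral(series) * (max - min);
}
```

Because all state lives in the series, seeding from something stable (for example a chunk's coordinates) reproduces the same ground every time that chunk is regenerated.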
- Casey explains what premultiplied alpha is
- Change the `LoadBMP` and `DrawBitmap` functions to use premultiplied alpha
- Handle `cAlpha` in `DrawBitmap`
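A rough sketch of what premultiplied alpha changes, using floating-point colors in [0, 1]. The type `color4` and both function names are illustrative assumptions; see day 83 for Casey's actual explanation.

```c
typedef struct color4 {
    float r, g, b, a;  /* all in [0, 1] */
} color4;

/* Done once at load time: multiply each color channel by its own alpha. */
static color4 Premultiply(color4 c)
{
    c.r *= c.a;
    c.g *= c.a;
    c.b *= c.a;
    return c;
}

/* Blend a premultiplied source over a destination:
   result = (1 - srcA) * dest + src, applied to all four channels. */
static color4 BlendPremultiplied(color4 dest, color4 src)
{
    float inv = 1.0f - src.a;
    color4 result = {
        inv * dest.r + src.r,
        inv * dest.g + src.g,
        inv * dest.b + src.b,
        inv * dest.a + src.a
    };
    return result;
}
```

The same formula works for alpha and for color, which is what makes it possible to composite bitmaps into an intermediate buffer (like the cached ground) and blend that buffer again later.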
- Add `groundBufferP` and draw ground based on this position
- Rename `DrawTest` to `DrawGroundChunk` and fill the whole buffer
- Introduce `transient_state` to help manage transient memory
- Define `transientArena` and store multiple ground buffers in transient memory
- Use `groundBitmapTemplate` to store repeated info (width, height) about the ground buffer
- DIFF: Casey uses `beginTemporaryMemory` and `endTemporaryMemory` calls to restore memory space used only in one frame. I think that API is not easy to use, so I implemented `save` and `restore`, just like in `CanvasRenderingContext2D`.
- Draw ground buffers
- Make world position `_offset` relative to the center point, and rename it to `offset`. There is no need to prefix it with an underscore.
- Draw chunks to see how big they are
  - Define `CenteredChunkPoint`
  - Define `DrawRectangleOutline`
- Change `metersToPixels` to a fixed number
- Cleanup: remove tileSideInMeters; we no longer have any tiles.
- Fill ground buffer for each chunk.
- Modify `FillGroundBuffer` to generate seamless grounds by iterating nine chunks each time
- Select the furthest buffer and fill it if we have run out of buffers
- Decrease `groundBufferCount` to test our eviction code
- Regenerate ground when the game reloads
  - Add a field `executableReloaded` in game_input to tell us whether the game has reloaded
- Why are the trees wiggling around?
  - Our blitting is not pixel-perfect now; entities' float coordinates may round to different integers, which changes the distance between them a little bit.
  - We will solve this problem when we have a real renderer!
- Clean up rendering stuff
  - Create `handmade_render_group.h` and `handmade_render_group.cpp` files
  - Put `render_piece` and `render_piece_group` into our newly created files
  - Increase piece count in piece group and use transient arena to alloc our piece group
  - Define `render_basis`
  - Delayed rendering: render pieces after we have pushed them all
- Use delayed rendering for ground buffers
- Rename `render_piece_group` to `render_group`
- Implement push buffer
  - Add `pushBufferBase`, `pushBufferSize` and `maxPushBufferSize` to render_group
  - Define `AllocateRenderGroup` and `PushRenderElement`
- Why do we use a push buffer to do the rendering?
  - Sorting!
  - Process the source buffer into something most suitable for the target
  - Architect our software renderer the way an actual GPU works
- Move all drawing functions to handmade_render_group.cpp
- Rename `render_piece` to `render_entry`
- Define `RenderGroupToOutput`
- Use "compact discriminated union"
  - `render_entry_clear`
  - `render_entry_rectangle`
  - `render_entry_bitmap`
- Define `render_entity_basis` to abstract common positioning logic
- RESOURCE: The ryg blog
- Implement `Clear`
- Implement `PushRectOutline`
- Use `PushBitmap` in `FillGroundBuffer`
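A compact sketch of the push buffer with a "compact discriminated union": each entry is a small type tag followed by a payload whose size depends on the tag. The struct and field names follow the notes where they exist; the payload fields, the enum, and `CountEntries` are assumptions, and here `PushRenderElement` writes the header itself.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum render_entry_type {
    RenderEntryType_Clear,
    RenderEntryType_Rectangle
} render_entry_type;

typedef struct render_entry_header {
    render_entry_type type;
} render_entry_header;

/* Hypothetical payloads. */
typedef struct render_entry_clear { float r, g, b, a; } render_entry_clear;
typedef struct render_entry_rectangle {
    float minX, minY, maxX, maxY;
} render_entry_rectangle;

typedef struct render_group {
    uint8_t *pushBufferBase;
    size_t pushBufferSize;
    size_t maxPushBufferSize;
} render_group;

/* Reserve header + payload and fill in the header.
   Returns the payload pointer, or NULL when the buffer is full. */
static void *PushRenderElement(render_group *group, size_t size,
                               render_entry_type type)
{
    size_t total = sizeof(render_entry_header) + size;
    if (total > group->maxPushBufferSize - group->pushBufferSize) {
        return NULL;
    }
    render_entry_header *header =
        (render_entry_header *)(group->pushBufferBase + group->pushBufferSize);
    header->type = type;
    group->pushBufferSize += total;
    return header + 1;
}

/* Walk the buffer the way RenderGroupToOutput would: read the tag,
   then skip a payload of the matching size. */
static int CountEntries(render_group *group, render_entry_type type)
{
    int count = 0;
    size_t at = 0;
    while (at < group->pushBufferSize) {
        render_entry_header *header =
            (render_entry_header *)(group->pushBufferBase + at);
        at += sizeof(*header);
        switch (header->type) {
        case RenderEntryType_Clear: at += sizeof(render_entry_clear); break;
        case RenderEntryType_Rectangle: at += sizeof(render_entry_rectangle); break;
        }
        if (header->type == type) {
            ++count;
        }
    }
    return count;
}
```

A real renderer would draw inside the `switch` instead of counting; the walk itself is the important part, since entries of different sizes sit back to back in one block of memory.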
- Casey explains what is a basis and how it works
- Implement a demo
render_entry_coordinate_system
to explore the basis transformation idea
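
To make the push buffer and the "compact discriminated union" idea concrete, here is a minimal sketch. The field names follow the notes (`pushBufferBase`, `pushBufferSize`, `maxPushBufferSize`), but the entry layouts, the `RenderEntryType_` enum names, the `PushRenderElement_` helper and `CountEntries` are my own assumptions for illustration, not the project's actual code.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    RenderEntryType_render_entry_clear,
    RenderEntryType_render_entry_rectangle,
} render_entry_type;

// Every entry in the buffer is preceded by this small tag.
typedef struct { render_entry_type type; } render_entry_header;

// Hypothetical entry payloads.
typedef struct { float r, g, b, a; } render_entry_clear;
typedef struct { float minX, minY, maxX, maxY, r, g, b, a; } render_entry_rectangle;

typedef struct {
    uint8_t *pushBufferBase;   // caller-provided storage, suitably aligned
    size_t pushBufferSize;     // bytes used so far
    size_t maxPushBufferSize;  // capacity in bytes
} render_group;

// Reserves header + body, writes the tag, returns a pointer to the body
// (NULL when the buffer is full). Meant to be called through the macro.
static void *PushRenderElement_(render_group *group, size_t bodySize,
                                render_entry_type type)
{
    size_t total = sizeof(render_entry_header) + bodySize;
    if (group->pushBufferSize + total > group->maxPushBufferSize) {
        return NULL;
    }
    render_entry_header *header =
        (render_entry_header *)(group->pushBufferBase + group->pushBufferSize);
    header->type = type;
    group->pushBufferSize += total;
    return header + 1;
}

#define PushRenderElement(group, type) \
    ((type *)PushRenderElement_(group, sizeof(type), RenderEntryType_##type))

// Walks the buffer the way an output pass would: read the tag, then skip
// (or draw) a body of the matching size.
static int CountEntries(const render_group *group, render_entry_type wanted)
{
    int count = 0;
    size_t at = 0;
    while (at < group->pushBufferSize) {
        const render_entry_header *header =
            (const render_entry_header *)(group->pushBufferBase + at);
        at += sizeof(*header);
        switch (header->type) {
            case RenderEntryType_render_entry_clear:
                at += sizeof(render_entry_clear);
                break;
            case RenderEntryType_render_entry_rectangle:
                at += sizeof(render_entry_rectangle);
                break;
            default:
                return count;  // unknown tag: stop walking
        }
        if (header->type == wanted) {
            ++count;
        }
    }
    return count;
}
```

Because entries are only recorded here and drawn later, the output pass is free to sort or preprocess them first, which is the point of the push buffer.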
- Collision detection has a lot to do with pixel filling
- Casey demonstrates how to fill a rectangle
  - Define `DrawRectangleSlowly`
  - Start from an aligned rectangle
  - Move to a rotated rectangle
  - Calculate the min/max bound rather than always check the whole buffer
  - Define `Perp` function
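
The rotated-rectangle fill can be sketched as four edge tests built from `Perp`, run only inside the bounding box of the corners. This is my own reduced version under a few assumptions: a one-byte-per-pixel buffer, integer pixel positions, `yAxis` lying counter-clockwise from `xAxis`, and hypothetical helper names (`IsInside`, `FillRotatedRect`).

```c
typedef struct { float x, y; } v2;

static v2 V2(float x, float y) { v2 r = { x, y }; return r; }
static v2 Add(v2 a, v2 b) { return V2(a.x + b.x, a.y + b.y); }
static v2 Sub(v2 a, v2 b) { return V2(a.x - b.x, a.y - b.y); }
static v2 Neg(v2 a) { return V2(-a.x, -a.y); }
static float Inner(v2 a, v2 b) { return a.x * b.x + a.y * b.y; }

// Rotates a vector 90 degrees counter-clockwise.
static v2 Perp(v2 a) { return V2(-a.y, a.x); }

// p is inside when it lies on the inner side of all four edges. Each test
// takes the inner product of (p - a point on the edge) with that edge's
// outward-facing normal, so "inside" means negative.
static int IsInside(v2 origin, v2 xAxis, v2 yAxis, v2 p)
{
    v2 far = Add(Add(origin, xAxis), yAxis);
    float edge0 = Inner(Sub(p, origin), Neg(Perp(xAxis)));
    float edge1 = Inner(Sub(p, Add(origin, xAxis)), Neg(Perp(yAxis)));
    float edge2 = Inner(Sub(p, far), Perp(xAxis));
    float edge3 = Inner(Sub(p, Add(origin, yAxis)), Perp(yAxis));
    return edge0 < 0 && edge1 < 0 && edge2 < 0 && edge3 < 0;
}

// Marks covered pixels with 1 and returns how many were marked.
static int FillRotatedRect(unsigned char *pixels, int width, int height,
                           v2 origin, v2 xAxis, v2 yAxis)
{
    v2 corners[4];
    corners[0] = origin;
    corners[1] = Add(origin, xAxis);
    corners[2] = Add(corners[1], yAxis);
    corners[3] = Add(origin, yAxis);

    // Conservative bounds around the corners, clamped to the buffer.
    int xMin = width - 1, xMax = 0, yMin = height - 1, yMax = 0;
    for (int i = 0; i < 4; ++i) {
        int cx = (int)corners[i].x, cy = (int)corners[i].y;
        if (cx - 1 < xMin) xMin = cx - 1;
        if (cx + 1 > xMax) xMax = cx + 1;
        if (cy - 1 < yMin) yMin = cy - 1;
        if (cy + 1 > yMax) yMax = cy + 1;
    }
    if (xMin < 0) xMin = 0;
    if (yMin < 0) yMin = 0;
    if (xMax > width - 1) xMax = width - 1;
    if (yMax > height - 1) yMax = height - 1;

    int filled = 0;
    for (int y = yMin; y <= yMax; ++y) {
        for (int x = xMin; x <= xMax; ++x) {
            if (IsInside(origin, xAxis, yAxis, V2((float)x, (float)y))) {
                pixels[y * width + x] = 1;
                ++filled;
            }
        }
    }
    return filled;
}
```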
- Implement textured quadrilaterals
  - For each pixel, calculate the `u` and `v` uniform coordinate
  - Use `u` and `v` to get color from texture
  - Populate pixel with that color
  - Implement alpha blending, just copy the old code
- Subpixel rendering
  - Casey demonstrates wiggling
  - And then solves it with Bilinear Texture Filtering
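
Bilinear filtering blends the four texels around the sample point instead of snapping to one of them, which is what removes the wiggling. A minimal single-channel sketch follows; `SampleBilinear` is a hypothetical name, `u` and `v` are assumed to be in 0..1, and scaling by `width - 2` is just a simple way to keep the `x + 1` fetch in bounds.

```c
// Linear blend: t = 0 gives a, t = 1 gives b.
static float Lerp(float a, float t, float b)
{
    return (1.0f - t) * a + t * b;
}

// texels: width * height single-channel values, row by row.
// Requires width >= 2 and height >= 2.
static float SampleBilinear(const float *texels, int width, int height,
                            float u, float v)
{
    // Map u, v onto texel space, leaving room for the +1 neighbors.
    float tx = u * (float)(width - 2);
    float ty = v * (float)(height - 2);

    int x = (int)tx;
    int y = (int)ty;
    float fx = tx - (float)x;  // fractional position between columns
    float fy = ty - (float)y;  // fractional position between rows

    const float *row0 = texels + y * width + x;
    const float *row1 = row0 + width;

    // Blend horizontally on both rows, then vertically between them.
    float blend0 = Lerp(row0[0], fx, row0[1]);
    float blend1 = Lerp(row1[0], fx, row1[1]);
    return Lerp(blend0, fy, blend1);
}
```

For RGBA textures the same blend would be applied per channel after unpacking the four texels.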
- Casey explains what Gamma Space is
  - It's non-linear and it breaks our math
  - RESOURCE
- Use `pow(x, 2.2)` and `pow(x, 1/2.2)` to convert between sRGB and linear space
  - `pow(x, 2.2)` is just a good approximation; it is not suitable for all monitors
  - We use `pow(x, 2)` to approximate `pow(x, 2.2)` (and `sqrt(x)` for the inverse) because it's much cheaper
- Implement `SRGB255ToLinear1` and `Linear1ToSRGB255`
- Implement a simple color tint
- When we load a BMP
  - Convert it to linear space
  - Multiply alpha with the color
  - Convert it back to sRGB space
- Remove `render_entry_header` in render entry types
  - Adding this header to every render entry type is very error-prone
  - Let the `PushRenderElement` function do this job
- `-Zo` is used for enhanced optimized code debugging
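
Written out, the conversions and the BMP load steps above look like this. It is a sketch of the math only: $c_{255}$ is one 8-bit color channel, $\alpha$ is the alpha channel already scaled to 0..1, and the squared form is the cheap approximation, not the exact sRGB curve.

$$
\begin{aligned}
\text{SRGB255ToLinear1:}\quad & c_{lin} = \left(\frac{c_{255}}{255}\right)^{2.2} \approx \left(\frac{c_{255}}{255}\right)^{2} \\
\text{premultiply alpha:}\quad & c'_{lin} = \alpha \, c_{lin} \\
\text{Linear1ToSRGB255:}\quad & c'_{255} = 255 \, (c'_{lin})^{1/2.2} \approx 255 \sqrt{c'_{lin}}
\end{aligned}
$$

For example, with the approximation a half-bright channel $c_{255} = 128$ maps to roughly $0.25$ in linear space, which is why blending directly on sRGB values gives results that look too dark.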
I am reading this book Computer Graphics from Scratch these days, it's a good source to learn about lighting.
- Only render render_entry_coordinate_system and turn the optimization flag off
- Doug Church: Lighting is the sound of graphics.
- Casey explains things about lighting and there are so many terms I don't understand...
- Getting lighting fully right is extremely hard
- Lighting problems in 2D
  - We don't know what the surfaces are
    - normal maps
  - We don't know what the light field is
    - point lights
    - light rendering
- RESOURCE: A good book about lighting, Physically Based Rendering: From Theory to Implementation
- Introduce normal map and environment map
  - Define `environment_map`
  - Add top, middle and bottom environment map and normal map to `DrawRectangleSlowly`
- Define `SampleEnvironmentMap`
- Define `MakeSphereNormalMap` to generate a fake normal map and test our code
- There are two types of bitmaps: front-facing bitmaps and up-facing bitmaps
- Clean up previous code
- Initialize top, middle and bottom env maps
- Note: Our roughness is always zero now
- Fill LOD with color and draw the LODs
- Fill LOD with checker board
- Define testDiffuse and testNormal to test our lighting program
- Casey demonstrates how to change saturation
  - avg = (r + g + b)/3
  - delta = (r - avg, g - avg, b - avg)
  - color = avg + saturationLevel * delta
- Calc the reflection vector: -e + 2Inner(e, N)N
- Modify `SampleEnvironmentMap`
  - Take the reflection vector as input
  - Define distanceFromMapInZ, let's say it's 1.0f
  - Define uvsPerMeter, let's say it's 0.01f
  - Calculate the point in the environment map and get color from it
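
The saturation change and the reflection vector from the notes above translate almost directly into code. A minimal sketch; the `v3` type and the function names `AdjustSaturation` and `Reflect` are my own, and `N` is assumed to be unit length.

```c
typedef struct { float x, y, z; } v3;

static v3 V3(float x, float y, float z) { v3 r = { x, y, z }; return r; }
static v3 Add(v3 a, v3 b) { return V3(a.x + b.x, a.y + b.y, a.z + b.z); }
static v3 Scale(float s, v3 a) { return V3(s * a.x, s * a.y, s * a.z); }
static float Inner(v3 a, v3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// color = avg + saturationLevel * delta
// level 0 gives gray, 1 leaves the color unchanged, above 1 oversaturates.
static v3 AdjustSaturation(v3 color, float saturationLevel)
{
    float avg = (color.x + color.y + color.z) / 3.0f;
    v3 delta = V3(color.x - avg, color.y - avg, color.z - avg);
    return Add(V3(avg, avg, avg), Scale(saturationLevel, delta));
}

// Reflection of the eye vector e about the normal N: -e + 2*Inner(e, N)*N
static v3 Reflect(v3 e, v3 N)
{
    return Add(Scale(-1.0f, e), Scale(2.0f * Inner(e, N), N));
}
```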
This is a math day. Casey explains matrices and other stuff of linear algebra.
Speaking about linear algebra, I highly recommend the book Linear Algebra by Jim Hefferon. It's freely available and totally accessible.
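Inverse of a rotation matrix, where X and Y are perpendicular unit vectors:

$$
R = \begin{pmatrix} X_x & Y_x \\ X_y & Y_y \end{pmatrix},
\qquad
R^{-1} = R' = \begin{pmatrix} X_x & X_y \\ Y_x & Y_y \end{pmatrix}
$$

Based on this fact:

$$
\begin{aligned}
X_x X_x + X_y X_y &= \mathrm{Inner}(X, X) = 1 \\
X_x Y_x + X_y Y_y &= \mathrm{Inner}(X, Y) = 0 \\
Y_x X_x + Y_y X_y &= \mathrm{Inner}(X, Y) = 0 \\
Y_x Y_x + Y_y Y_y &= \mathrm{Inner}(Y, Y) = 1
\end{aligned}
$$

So

$$
R' R = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
$$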
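Because normals are perpendicular to the surface they describe, they are affected in a perpendicular way by any transforms we do, so a non-uniform scale cannot be applied to a normal the same way it is applied to an ordinary vector.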
- Rotate the normal
- Document `SampleEnvironmentMap` function
- Paint the LOD to debug SampleEnvironmentMap
In a 2D perspective, things are intentionally wrong, because the art wants them to be different.
- Fix `DrawRectangleSlowly`
- Calculate the correct screenSpaceUV
- Add `z` in environment_map
- Set `z` for environment maps
- There are two types of cards:
  - lying-down card
  - standing-up card
- Define `MakeSphereDiffuseMap`
- TODO: The mechanism of lighting is very confusing for me now, need to review it later.
- Switch to Y-up render targets. I don't need to do anything because I did this long before.
- Pull out render api
  - Remove `PushPiece`
  - Make alignment baked in the bitmap
  - Remove `entityZC`
  - Unify v2 offset and offsetZ into v3 offset
  - `PushBitmap` should accept a v4 color
- Use `DrawRectangleSlowly` to render bitmap so we can scale
- Store `zOffset` in game state and control it using action up/down buttons
- Scale the position and size based on Z
- Remove y offset caused by z
- Z slices control the scaling of things, whereas Z offsets inside a slice control Y offsetting
- Remove `zOffset` in game_state
- Do not preserve the offset z of cameraP
- Add `globalAlpha` to render group
- Fade entities based on their z value
  - Define `fadeTopStartZ`, `fadeTopEndZ`, `fadeBottomStartZ` and `fadeBottomEndZ`
  - Define `Clamp01MapToRange`
- Modify `cameraBounds`
- Change `GetEntityRenderBasisResult` function to implement proper perspective projection
  - the core formula: p' = (dp) / (Cz - Pz)
  - NOTE: the `cameraP` in game_state is where we are looking at, not the actual camera position
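
Two small pieces from the notes above, sketched in code: `Clamp01MapToRange` as used for fading, and the core projection formula p' = (dp) / (Cz - Pz). The function bodies are my own guess at a minimal version; `ProjectToScreen` and its parameter names are hypothetical, with `focalLength` playing the role of d and `cameraZ` the role of Cz.

```c
typedef struct { float x, y; } v2;

static float Clamp01(float value)
{
    if (value < 0.0f) return 0.0f;
    if (value > 1.0f) return 1.0f;
    return value;
}

// Maps t from [min, max] onto [0, 1], clamping outside the range.
static float Clamp01MapToRange(float min, float t, float max)
{
    float result = 0.0f;
    float range = max - min;
    if (range != 0.0f) {
        result = Clamp01((t - min) / range);
    }
    return result;
}

// p' = (d * p) / (Cz - Pz): points further from the camera (larger
// Cz - Pz) end up closer to the screen center and appear smaller.
static v2 ProjectToScreen(float focalLength, float cameraZ,
                          float px, float py, float pz)
{
    v2 result = { 0.0f, 0.0f };
    float distanceToP = cameraZ - pz;
    if (distanceToP > 0.0f) {  // ignore points at or behind the camera
        float scale = focalLength / distanceToP;
        result.x = scale * px;
        result.y = scale * py;
    }
    return result;
}
```

A fade toward the top could then be something like `alpha = 1.0f - Clamp01MapToRange(fadeTopStartZ, z, fadeTopEndZ)`, assuming the start value is below the end value.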
- Change `align` to `alignPercentage`
- Add `widthOverHeight` to loaded_bitmap
- Remove `metersToPixels` in game_state
- `PushBitmap` should take a `height` param
- Add `size` to render_entry_bitmap
- Add `screenDim` param to `GetEntityRenderBasisResult`
- Define `GetCameraRectangleAtDistance`
  - Define `Unproject`
- Add `MetersToPixels` to render group; it converts meters on the monitor into pixels on the monitor
- Use `PushRectOutline` to verify that GetCameraRectangleAtDistance returns the correct value
- Define `render_group_camera`
  - Add `gameCamera` and `renderCamera` to render group
  - Now we can see the big picture of our game world
- Reenable ground buffer code
- Make ground the same size as the chunk
- Define another LoadBMP call to set a default center align
- Use meters in `FillGroundBuffer`
This is a blackboard day. No code involved.
- SIMD is everywhere
- Modern CPUs are heavily out of order
- Casey explains the difference between latency and throughput
- In most cases, we only care about throughput, not latency.
- Basic process of making things run quickly
  - Gather statistics
    - where it is slow
    - what are their characteristics
  - Make an estimation
  - Analyze "efficiency" and "performance"
    - "efficiency" is about how much work we have to do
    - "performance" is about how to make the CPU do the work
  - Start coding
- Define `BEGIN_TIMED_BLOCK` and `END_TIMED_BLOCK` macros to track performance
- Define `debug_cycle_counter` struct to store counters
- Define `HandleDebugCounters` to display counters
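
A minimal sketch of what such timing macros might look like, assuming an x86 target where the `__rdtsc` intrinsic is available. The counter IDs, the `globalCounters` array and the `SumTo` test function are hypothetical, and the real macros in the project may differ.

```c
#include <stdint.h>
#if defined(_MSC_VER)
#include <intrin.h>      // __rdtsc on MSVC
#else
#include <x86intrin.h>   // __rdtsc on GCC/Clang (x86 only)
#endif

typedef struct {
    uint64_t cycleCount;  // total cycles spent inside the block
    uint32_t hitCount;    // how many times the block ran
} debug_cycle_counter;

// One ID per timed block (hypothetical names).
enum {
    DebugCycleCounter_SumTo,
    DebugCycleCounter_Count
};

static debug_cycle_counter globalCounters[DebugCycleCounter_Count];

// Records the timestamp counter at the start of the block.
#define BEGIN_TIMED_BLOCK(id) uint64_t startCycleCount##id = __rdtsc()

// Accumulates the elapsed cycles and bumps the hit count.
#define END_TIMED_BLOCK(id) \
    do { \
        globalCounters[DebugCycleCounter_##id].cycleCount += \
            __rdtsc() - startCycleCount##id; \
        ++globalCounters[DebugCycleCounter_##id].hitCount; \
    } while (0)

// Example of a function instrumented with the macros.
static uint32_t SumTo(uint32_t n)
{
    BEGIN_TIMED_BLOCK(SumTo);
    uint32_t sum = 0;
    for (uint32_t i = 0; i < n; ++i) {
        sum += i;
    }
    END_TIMED_BLOCK(SumTo);
    return sum;
}
```

Dividing `cycleCount` by `hitCount` gives the average cost per call, which is the number worth watching while optimizing.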
- Copy `DrawRectangleSlowly` to `DrawRectangleHopefullyQuickly`
- Flatten `DrawRectangleHopefullyQuickly`
- Think about the question: What is our "wide" strategy?
- SOA (Struct of Arrays) vs AOS (Array of Structs)
  - C makes AOS really easy
  - but SIMD needs SOA
- We are targeting SSE and SSE2
- Convert `FillRectangleHopefullyQuickly` to operate on 4 pixels at a time
- RESOURCE: Numerical Methods that work
- RESOURCE: What Every Computer Scientist Should Know About Floating-Point Arithmetic
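
The AOS vs SOA difference is easiest to see side by side. In this sketch (types and function names are my own, for illustration) four pixels are regrouped so that each channel is contiguous, which is the shape a 4-wide SSE register wants.

```c
// AOS: one struct per pixel, channels interleaved in memory (r g b a r g b a ...).
typedef struct { float r, g, b, a; } pixel_aos;

// SOA: one array per channel, so four reds sit next to each other
// and can be loaded into a single 4-wide register.
typedef struct { float r[4], g[4], b[4], a[4]; } pixel4_soa;

// Regroups four AOS pixels into one SOA block.
static pixel4_soa AosToSoa(const pixel_aos px[4])
{
    pixel4_soa result;
    for (int i = 0; i < 4; ++i) {
        result.r[i] = px[i].r;
        result.g[i] = px[i].g;
        result.b[i] = px[i].b;
        result.a[i] = px[i].a;
    }
    return result;
}

// The same operation on all four red values; with SOA this loop maps
// directly onto one SIMD multiply.
static void TintRed(pixel4_soa *p, float tint)
{
    for (int i = 0; i < 4; ++i) {
        p->r[i] *= tint;
    }
}
```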
- Define `END_TIMED_BLOCK_COUNTED`
- SIMDify `DrawRectangleHopefullyQuickly`
  - `_mm_mul_ps`
  - `_mm_add_ps`
  - `_mm_sub_ps`
  - `_mm_sqrt_ps`
  - `_mm_max_ps`
  - `_mm_min_ps`
- Before SIMDifying: about 135 cycles; after: about 100 cycles
- Casey accidentally showed a performance boost by just removing two inline functions
- Write pixels using SIMD
- Use structure art technique to verify that our unpack code is correct
- Be careful about alignment
- Used SIMD intrinsics
  - `_mm_unpacklo_epi32`
  - `_mm_castps_si128`
  - `_mm_cvttps_epi32`
  - `_mm_or_si128`
  - `_mm_slli_epi32`
  - `_mm_storeu_si128`
- Convert load code to SIMD except for texture fetching
- Used SIMD intrinsics
  - `_mm_cvtps_epi32` will round to nearest by default
  - `_mm_cvtepi32_ps`
  - `_mm_andnot_si128`
  - `_mm_loadu_si128`
  - `_mm_srli_epi32`
  - `_mm_cmpge_ps`
  - `_mm_cmple_ps`
  - `_mm_movemask_epi8`
- Change the way of calculating px
- Count intrinsics by substituting intrinsics with macros (I skipped this)
  - `_mm_sqrt_ps` does not hurt us too badly, about 5 cycles
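
As a small illustration of writing pixels with the intrinsics listed above, here is a sketch that packs four pixels held as separate channel registers into four 32-bit values. It assumes an SSE2-capable x86 target, channel values already in 0..255, and an ARGB layout (alpha in the top byte); `PackAndStore4` is a hypothetical name.

```c
#include <emmintrin.h>  // SSE2
#include <stdint.h>

// r, g, b, a each hold one channel for four pixels (one pixel per lane).
static void PackAndStore4(uint32_t *dest, __m128 r, __m128 g, __m128 b, __m128 a)
{
    // Float -> 32-bit int, rounding to nearest by default.
    __m128i ir = _mm_cvtps_epi32(r);
    __m128i ig = _mm_cvtps_epi32(g);
    __m128i ib = _mm_cvtps_epi32(b);
    __m128i ia = _mm_cvtps_epi32(a);

    // Shift each channel to its byte position inside the pixel.
    __m128i sr = _mm_slli_epi32(ir, 16);
    __m128i sg = _mm_slli_epi32(ig, 8);
    __m128i sb = ib;
    __m128i sa = _mm_slli_epi32(ia, 24);

    // Combine the channels and write four pixels at once.
    __m128i out = _mm_or_si128(_mm_or_si128(sr, sg), _mm_or_si128(sb, sa));
    _mm_storeu_si128((__m128i *)dest, out);  // unaligned store: dest may be anywhere
}
```

`_mm_setr_ps` is convenient for building test inputs because its arguments go in memory order, so the first argument ends up in the first pixel.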
- Casey demonstrates how to use IACA to profile our program
- We don't have to convert color to 0-1 first. We can do operations in 0-255 space, which saves a bunch of mul ops.
- Make `texture->memory` and `texture->pitch` local variables (This doesn't work for me)
- Use `_mm_rsqrt_ps` instead of `_mm_sqrt_ps` (This doesn't work for me either)
This was a really long day, 4 hours in total.
- The IACA analyser does not take loops into account, so we need to manually unroll the loop
- `_mm_setr_ps`: use memory order rather than register order
- `_mm_mul_epi32` produces two 64-bit values, so we need to use `_mm_mullo_epi32`
- Define `rectangle2i` and `Union`, `Intersect` functions
  - As with `rectangle2`, `max` is not included, so we need to change `DrawRectangleQuickly`
- Make `DrawRectangleQuickly` draw on even lines or odd lines
- `DrawRectangleQuickly` takes a clip rect
- Use `clipMask` to ensure drawing inside clip rect
- Define `InvertedInfinityRectangle`
- Define `TiledRenderGroupToOutput`
- NOTE: `_mm_mullo_epi32` belongs to SSE4.1
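
A minimal sketch of the integer rectangle helpers mentioned above, with `max` treated as exclusive. The field names and the `HasArea` helper are my own assumptions; only the function names `Union`, `Intersect` and `InvertedInfinityRectangle` come from the notes.

```c
#include <limits.h>

// min is inclusive, max is exclusive.
typedef struct { int minX, minY, maxX, maxY; } rectangle2i;

static int MinI(int a, int b) { return a < b ? a : b; }
static int MaxI(int a, int b) { return a > b ? a : b; }

// Overlapping region of a and b (may be empty, see HasArea).
static rectangle2i Intersect(rectangle2i a, rectangle2i b)
{
    rectangle2i r = { MaxI(a.minX, b.minX), MaxI(a.minY, b.minY),
                      MinI(a.maxX, b.maxX), MinI(a.maxY, b.maxY) };
    return r;
}

// Smallest rectangle containing both a and b.
static rectangle2i Union(rectangle2i a, rectangle2i b)
{
    rectangle2i r = { MinI(a.minX, b.minX), MinI(a.minY, b.minY),
                      MaxI(a.maxX, b.maxX), MaxI(a.maxY, b.maxY) };
    return r;
}

static int HasArea(rectangle2i r)
{
    return r.minX < r.maxX && r.minY < r.maxY;
}

// An "empty" starting value: the Union of this with any rectangle is that
// rectangle, which makes it handy for accumulating bounds.
static rectangle2i InvertedInfinityRectangle(void)
{
    rectangle2i r = { INT_MAX, INT_MAX, -INT_MAX, -INT_MAX };
    return r;
}
```

With an exclusive `max`, two tiles that share an edge intersect to a rectangle with no area, so no pixel is drawn by both tiles.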
This is a blackboard day, no code involved.
- Casey talks about process, thread, hyperthreading and all kinds of that stuff
- Casey demonstrates how to use `CreateThread` windows api
- `CloseHandle` will not close the thread, it just releases the handle to the OS.
- x64 provides special instructions to help us write multithread programs
  - interlocked compare and exchange
- Casey demonstrates the unsafe multithread code
- We need a way to tell the compiler and the processor not to reorder things
  - `_WriteBarrier` for the compiler
  - `_mm_sfence()` for the processor
- Use `volatile` to tell the compiler some variable may be changed without its local knowledge
- Use `InterlockedIncrement` to safely modify our variable
- Use semaphores to implement a basic multithread work queue
  - `CreateSemaphoreEx`
  - `WaitForSingleObjectEx`
  - `ReleaseSemaphore`
- Build a single producer multiple consumer queue system
  - data structure:
    - `platform_work_queue`
    - `platform_work_queue_entry`
    - `platform_work_queue_callback`
  - `PlatformAddEntry`
  - `PlatformCompleteAllWork`
  - `Win32ProcessNextEntry`
  - `InterlockedCompareExchange`
  - Make it circular so we don't worry about wrapping
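
To show the shape of a single producer, multiple consumer circular queue, here is a portable sketch that uses C11 `<stdatomic.h>` in place of the Win32 interlocked functions (`atomic_compare_exchange_strong` stands in for `InterlockedCompareExchange`). All type, field and function names are my own, there are no semaphores or threads in it, and it is not the project's implementation: it only shows how consumers claim entries and how completion is tracked.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef void work_queue_callback(void *data);

#define WORK_QUEUE_CAPACITY 8u  /* hypothetical size; must be a power of two */

typedef struct {
    _Atomic(work_queue_callback *) callback;
    _Atomic(void *) data;
} work_queue_entry;

typedef struct {
    atomic_uint writeCount;  /* entries published by the producer */
    atomic_uint readCount;   /* entries claimed by consumers */
    atomic_uint doneCount;   /* entries whose callback has returned */
    work_queue_entry entries[WORK_QUEUE_CAPACITY];
} work_queue;

/* Producer only. Returns 0 when the ring has no free slot. */
static int AddEntry(work_queue *queue, work_queue_callback *callback, void *data)
{
    unsigned w = atomic_load(&queue->writeCount);
    if (w - atomic_load(&queue->doneCount) >= WORK_QUEUE_CAPACITY) {
        return 0;
    }
    work_queue_entry *entry = &queue->entries[w % WORK_QUEUE_CAPACITY];
    atomic_store(&entry->callback, callback);
    atomic_store(&entry->data, data);
    atomic_store(&queue->writeCount, w + 1);  /* publish after the entry is written */
    return 1;
}

/* Any thread. Returns 0 when there was nothing left to claim. */
static int ProcessNextEntry(work_queue *queue)
{
    unsigned r = atomic_load(&queue->readCount);
    if (r == atomic_load(&queue->writeCount)) {
        return 0;
    }
    work_queue_entry *entry = &queue->entries[r % WORK_QUEUE_CAPACITY];
    work_queue_callback *callback = atomic_load(&entry->callback);
    void *data = atomic_load(&entry->data);
    /* Only the thread that moves readCount from r to r + 1 runs entry r. */
    if (atomic_compare_exchange_strong(&queue->readCount, &r, r + 1)) {
        callback(data);
        atomic_fetch_add(&queue->doneCount, 1);
    }
    return 1;
}

/* The producer helps out until every published entry has completed. */
static void CompleteAllWork(work_queue *queue)
{
    while (atomic_load(&queue->doneCount) != atomic_load(&queue->writeCount)) {
        ProcessNextEntry(queue);
    }
}

/* Example callback: increments the int that data points to. */
static void IncrementInt(void *data)
{
    ++*(int *)data;
}
```

The counters only ever increase and are reduced modulo the capacity when indexing, which is one way to get the "circular" behavior without special wrap handling. In a real threaded version the worker threads would sleep on a semaphore when `ProcessNextEntry` finds nothing to do.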
- We can use `GetCurrentThreadId` to get the current thread id. This is for testing.
- Render with multithreading
- In x64, there is no need to use `_mm_sfence` because writes are always ordered.
- `_mm_sfence` is only necessary when you are writing to something like write combining memory which may reorder things for you
- Assert that outputTarget's memory is aligned to 16 bytes in `TiledRenderGroupToOutput`
- Modify `DrawRectangleQuickly` to support memory aligning
  - Introduce `startClipMask` and `endClipMask` and set them correctly
  - Use `_mm_load_si128` and `_mm_store_si128` instead of the unaligned version
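
The idea behind the aligned version is to always load and store whole 16-byte blocks and use a mask to leave pixels outside the rectangle untouched. The sketch below is a simplified variation, assuming SSE2: it rebuilds the mask for every block instead of precomputing separate `startClipMask` and `endClipMask` values, and `FillRowAligned` is a hypothetical name. The row must start on a 16-byte boundary, its storage must extend to a multiple of 4 pixels, and `minX` is assumed to be non-negative.

```c
#include <emmintrin.h>  // SSE2
#include <stdint.h>

// Fills row[minX .. maxX) with color using aligned 4-pixel loads and stores.
static void FillRowAligned(uint32_t *row, int minX, int maxX, uint32_t color)
{
    int alignedMin = minX & ~3;        // round down to a multiple of 4
    int alignedMax = (maxX + 3) & ~3;  // round up to a multiple of 4

    __m128i color4 = _mm_set1_epi32((int)color);
    __m128i lane = _mm_setr_epi32(0, 1, 2, 3);
    __m128i minXm1 = _mm_set1_epi32(minX - 1);
    __m128i maxX4 = _mm_set1_epi32(maxX);

    for (int x = alignedMin; x < alignedMax; x += 4) {
        // Per-lane pixel x coordinates of this block.
        __m128i xs = _mm_add_epi32(_mm_set1_epi32(x), lane);

        // A lane is all ones when minX <= x < maxX, all zeros otherwise.
        __m128i mask = _mm_and_si128(_mm_cmpgt_epi32(xs, minXm1),
                                     _mm_cmplt_epi32(xs, maxX4));

        __m128i *dest = (__m128i *)(row + x);
        __m128i old = _mm_load_si128(dest);  // aligned load
        __m128i out = _mm_or_si128(_mm_and_si128(mask, color4),
                                   _mm_andnot_si128(mask, old));
        _mm_store_si128(dest, out);          // aligned store
    }
}
```

Only the first and last blocks can have partially set masks, which is why precomputing a start mask and an end mask is enough in the real code.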