forked from UO-OACISS/apex
-
Notifications
You must be signed in to change notification settings - Fork 0
/
todo.txt
166 lines (160 loc) · 6.48 KB
/
todo.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
ToDo list, as of April 10, 2015:
--------------------------------
- collect idle rate, queue length statistics
- hpx3 (Nick)
- Hpx5 (kevin)
- clean up MPI implementation
- Refactor the communication API
- Refactor the instrumentation API (consistently handle name/address?) (done: April 15, 2015)
- Get more application cases for HPX5
- miniAMR
- Find multi-objective optimization case
- Find multi-input optimization case (1D stencil?) (done: April 13, 2015)
- # threads (done: April 13, 2015)
- decomposition granularity (done: April 13, 2015)
- Use Cases:
- Changing queue order from FIFO -> LIFO
- Triggering repartition (1D stencil) (done: April 13, 2015)
- Triggering AMR
- Triggering load balance
- Triggering algorithm change
- Triggering device change (GPU, Phi example)
- Throttling for memory (monitor headroom?)
- Make APEX 0.1 release.
- Refine OMPT support
- make 1D stencil task example? (done: April 13, 2015)
- Polychronopolous OpenMP SC paper
- Change OpenMP runtime options at runtime with policy?
- Make event generation asynchronous? - i.e. return to worker as fast as possible
- validate PAPI support
- segregate examples and tests
- validate test output
- Verify TAU support
- HPX-3
- Trace
- HPX-5
- Profile
- Trace
- set up buildbot to test all configuration combinations?
- Active Harmony
- TAU
- PAPI
- MPI
- OMPT
- APEX should do a global reduction before dumping the profile to the screen.
Currently only the master process data is dumped.
- Related: the reduction feature should take a list of profiles, not just one.
The list can be just one, of course.
- Step 1: worker nodes write number of timers/counters to memory location
- Step 2: root node gets number of timers/counters from each node
- Step 3: worker nodes write profiles to memory location (strings! how!)
- Step 4: root node gets profiles/counters from each node
- Modify the remote "steal" as a policy?
- Experiments without APEX:
- Run application in parametric mode, with variable numbers of HPX threads.
- libPXGL - sssp
- miniAMR
- Apex for MPI_T? Using MPI
- RIOS - should RIOS be providing power, temperature, network, disk, etc.
status information? That is, should we be going through RCR/RIOS to get these
things, rather than /proc?
- Get LULESH benchmark for Jesus (done, sort of - not public)
- ompss example?
- ARGO opportunities?
- Dynamic adaption to perturbation/variation?
- AMR example?
- APEX API refactoring. Needs to be done! (done, April 15, 2015)
- Items from meeting with LSU:
- Need to double check that APEX doesn't lock worker threads
(use HPX semaphore)
- Need to double-check thread synchronization in APEX in general.
(see above)
- Set up a build-bot? (buildbot)
- Idle rate is a good proxy for memory bandwidth saturation
- Partition size controls work granularity in 1D stencil.
- STAR is a good example for load balancing.
- profiler_listener should be HPX thread, and use HPX
semaphore (user level signaling) instead of OS level signaling.
- get_profile should return a const pointer to an apex_profiler object
- Add more self-validating tests to HPX-3, HPX-5
- Add apex_yield() function. (done - April 9, 2015)
- Move from googlecode to github (done - April 6, 2015)
- Change HPX-3 APEX integration from submodule to git cmake macro (done - April 9, 2015)
Completed ToDo list, as of April 10, 2015:
------------------------------------------
- email any/all details for APEX to Martin so we can think about PIPER integration.
- Write the main page documentation for APEX.
- Introduction
- Interface links
- Installation
- Library dependencies
- Environment Variables
- Usage Examples
- Acknowledgements
- Add lots of comments to code for Doxygen documentation
- Add doxygen generation
- instrument HPX3 runtime directly, without ITT interface.
- "reset counters" support - all or individual or a set. Maybe just a set? hard to do with C interface?
- By name
- By address
- integrate HPX5 runtime modifications into a separate library - make it a policy?
- clean up HPX5 implementation
- make separate APEX support library in HPX5
- update to new source code structure in HPX5
- make libapex_hpx (move from extras/apex)
- move throttling code to APEX repo for use in HPX3
- Get more application cases for HPX3
- miniGhost
- 1D stencil
- Get more application cases for HPX5
- LULESH
- libPXGL - sssp
- Use Cases:
- throttling for memory / power (HPX5 LULESH, SSSP examples)
- Changing number of threads dynamically for performance (Patricia Grubel example)
- Online monitoring
- Periodic power measurement on Edison (using RCR or direct) - reads it directly.
- Periodic /proc/stat data on Edison (using RCR or direct) - APEX reads it directly, when available.
- OMPT support
- Generic event handler interface
- With an event stack?
- Verify TAU support
- HPX-3
- Profile
- HPX-5
- Profile
- Change all void* types in APEX to apex_function or whatever named type is appropriate
- Test with valgrind to check for leaks and potential memory errors?
- OMPT example
- Build everything on Edison
- APEX - needs boost libraries added to package.
- HPX5
- HPX5 with APEX
- HPX3
- When apex_finalize happens, stop all periodic policies!
- Move the throttling code to the APEX library
- Experiments without APEX:
- Run application in parametric mode, with variable numbers of HPX threads.
- LULESH
Todo items (prior to April 10, 2015):
-------------------------------------
- integrate APEX into HPX (done)
- move source files into HPX source directory (done)
- update HPX CMake files to find APEX requirements (done)
- Make TAU optional (done)
- make TAU an event listener (done)
- Make event generation asynchronous (thinking)
- multiple generator, single consumer queue of events (done)
- fix apex::finalize problem (from multiple threads)
- re-integrate RCR (done)
- RAPL power measurements
- Memory controller saturation example
- design policy engine (done)
- implement policy engine (done)
Logistical Todo Items:
----------------------
- establish one-on-one communication with LSU for integration with HPX (done)
- establish one-on-one communication with RENCI for integration with RCR (done)
- establish one-on-one communication with Sandia for integration with RIOS
- migrate development to hermione@LSU
- move the server (done)