<img width="350" style="float:right" src="pics/YGil2.png"/>
<TITLE>Yolanda Gil's Awards and Grants</TITLE>
<H2> <A HREF="https://knowledgecaptureanddiscovery.github.io/yolanda_gil_website">Yolanda Gil</A> (PI), Awards and Grants</H2>
<UL>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
Artificial Intelligence and Community Driven Wildland Fire Innovation
via a WIFIRE Commons Infrastructure for Data and Model Sharing (Phase II).</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Award number OIA-2134904.
September 2021 - August 2023.
<BR>
Ilkay Altintas (PI), Yolanda Gil (Co-PI), John Hiers (Co-PI), and Rodman Linn (Co-PI).
<BR>
<i>
This project is creating an artificial intelligence (AI) enabled approach to
support the design and management of controlled fires that aid in
wildfire preparedness and the prevention of megafires. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
Towards Reflection Competencies for AI Scientists: Developing a Conceptual
Framework and Open Research Platform.</A>
</font></b><BR>
<a href="https://www.onr.navy.mil/">
Office of Naval Research (ONR)</a>.
<BR>
Award ID N00014-21-1-2437.
June 2021 - May 2023.
<BR>
Yolanda Gil (PI).
<BR>
<i>
As scientific questions become more complex, the capabilities of scientists to do
research will need to be augmented with AI systems.
This project will develop an open architecture for cognitive AI scientists
that can formulate scientific questions, devise strategies to answer them,
and place new findings in the context of the original question.
A core aspect of this research is capturing scientific knowledge to reason
about open questions, construct plausible hypotheses, formulate appropriate
methods to test them, and interpret the results obtained. The proposed project
will be carried out in two phases. In the first phase, we will develop the
conceptual framework and prototype its core competencies and demonstrate it
in two scientific disciplines. In the second phase, we will exercise the
conceptual framework in new domains, and extend it with capabilities to
automatically write sections of scientific articles describing approach
and method in selected domains. This research will advance many areas of AI,
including cognitive architectures, knowledge representation, reasoning,
planning, learning, explanation, and metareasoning.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
Artificial Intelligence and Community Driven Wildland Fire Innovation
via a WIFIRE Commons Infrastructure for Data and Model Sharing.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Award number OIA-2040676.
September 2020 - May 2022.
<BR>
Ilkay Altintas (PI), Yolanda Gil (Co-PI), John Hiers (Co-PI), and Rodman Linn (Co-PI).
<BR>
<i>
This project is creating a data-driven, artificial intelligence (AI) enabled and
model-based scientific approach that ultimately aims to limit and even prevent the
devastating effects of wildfires by using advanced technologies to support
fire mitigation, preparedness, response, and recovery. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
Automating Machine Learning for Time Series Analysis.</A>
</font></b><BR>
JP Morgan Chase.
<BR>
March 2019 - February 2022.
<BR>
Yolanda Gil (PI), Deborah Khider (Co-PI).
<BR>
<i>
The goal of this project is to automate time series analysis. This would enable
non-experts to analyze time series data with high-quality, proven methods, and
would also allow efficient analysis of the vast amounts of time series data available.
We are developing an automated system for time series analysis. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
High Resolution Mapping of the Genetic Risk for Disease in the Aging Brain.</A>
</font></b><BR>
<a href="http://www.nih.gov">
National Institutes of Health (NIH)</a>.
<BR>
Award number 1R01AG059874-01.
August 2018 - November 2023.
<BR>
Neda Jahanshad (PI), Yolanda Gil (Co-PI).
<BR>
<i>
This project is developing a discovery engine, powered with intelligent workflows,
that will continually process neuroscience data. The engine will autonomously trigger
the execution of relevant families of workflows, customize them to the data at hand,
and alert users of interesting findings. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://mint-project.info">
MINT: Model INTegration Through Knowledge-Rich Data and Process Composition.</A>
</font></b><BR>
<a href="http://www.darpa.gov">
Defense Advanced Research Projects Agency (DARPA)</a>.
<BR>
Award number W911NF-18-1-0027.
December 2017 - November 2021.
<BR>
Yolanda Gil (PI), Ewa Deelman (Co-PI), Craig Knoblock (Co-PI), Rafael Ferreira (Co-PI),
Kelly Cobourn (Co-PI), Christopher Duffy (Co-PI), Vipin Kumar (Co-PI),
Scott D. Peckham (Co-PI).
<BR>
<i>
Major societal and environmental challenges require forecasting how natural processes
and human activities affect one another. There are many areas of the globe where climate
affects water resources and therefore food availability, with major economic and social
implications. Today, such analyses require significant effort to integrate highly
heterogeneous models from separate disciplines, including geosciences, agriculture,
economics, and social sciences. Model integration requires resolving semantic,
spatio-temporal, and execution mismatches, which is largely done by hand today
and may take more than two years. This project will develop a new approach to use a wide
range of semantics in modeling environments in order to significantly reduce the time
needed to develop new integrated models while ensuring their utility and accuracy. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
EarthCube Integration: ASSET: Accelerating Scientific Workflows using EarthCube Technologies.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Award number ICER-1740683.
September 2017 - August 2019.
<BR>
Scott D. Peckham (PI), Yolanda Gil (co-PI), Cindy Bruyere (co-PI), Michael D.
Daniels (co-PI), James Done (co-PI).<BR>
<i>
This project is developing a framework to capture scientific workflows in different
domains of geosciences, characterize the type of work and the amount of effort involved
in each activity, and map workflow activities to available tools and infrastructure. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
EarthCube Data Infrastructure: A unified experimental-natural digital data system for
analysis of rock microstructures.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Award number ICER-1639716.
September 2017 - August 2020.
<BR>
Julie Neuman (PI), Yolanda Gil (co-PI), J. Douglas Walker (co-PI), Philip Skemer (co-PI),
Matty Mookerjee (co-PI), Gurman Gill (co-PI), Chris J. Marone (co-PI), and Basil
Tikoff (co-PI).
<BR>
<i>
This project is developing semantic workflows to analyze geology data about
rock features and microstructures using image analysis and machine learning. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.organicdatascience.org/">
Crowdsourcing metadata for the ENIGMA neuroscience collaboration.</A>
</font></b><BR>
<a href="http://www.kavlifoundation.org/">
The Kavli Foundation</a>.
<BR>
April 2017 - April 2019.
<BR>
Paul Thompson (PI), Neda Jahanshad (co-PI), Yolanda Gil (Co-PI).
<BR>
<i>
This project is extending the Organic Data Science framework to use semantics
and metadata to manage data and other information about the ENIGMA neuroscience
collaboration. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
DSBox: Data Scientist in a Box.</A>
</font></b><BR>
<a href="http://www.darpa.gov">
Defense Advanced Research Projects Agency (DARPA)</a>.
<BR>
Award number FA8750-17-C-0106.
May 2017 - May 2021.
<BR>
Pedro Szekely (PI), Yolanda Gil (Co-PI), Aram Galstyan (Co-PI), Andrew McCallum (co-PI),
Steve Minton (co-PI).
<BR>
<i>
This project is developing an intelligent system that incorporates significant expertise
in data science and machine learning in order to: 1) automatically generate data analysis
workflows that include feature generation, feature selection, data cleaning, and machine
learning steps for any kind of input data including tabular, text, image, audio data and
their combinations; 2) provide extensible libraries of data processing and machine learning
components with semantic metadata that can be used to compose valid workflows; and
3) interactively generate multi-step data science solutions that incorporate user
constraints and expertise in the domain. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.is-geo.org">
Intelligent Systems Research to Support Geosciences: A Research Coordination Network.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Award number ICER-1632211. September 2016 - August 2018.
<BR>
Suzanne Pierce (PI), Imme Ebert-Uphoff (Co-PI), Yolanda Gil (Co-PI), Basil Tikoff (Co-PI).
<BR>
<i>
This project will support an emerging community of interdisciplinary researchers
to enable advances in our understanding of Earth systems through
innovative applications of intelligent and information systems to fundamental
geosciences problems. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
Model Integration for Big Mechanism.</A>
</font></b><BR>
<a href="http://www.darpa.gov">
Defense Advanced Research Projects Agency (DARPA)</a>.
<BR>
Award number W911NF-14-1-0364.
June 2016 - October 2016.
<BR>
Yolanda Gil (PI).
<BR>
<i>
This project investigates the integration of models across disciplines through semantic
techniques to support flexible model integration and simulation at large scale.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
A Discovery Engine for Reproducible Comparable Multi-Omics Analysis.</A>
</font></b><BR>
<a href="http://www.nih.gov">
National Institutes of Health (NIH)</a>.
<BR>
Award number 1R01GM117097-01.
February 2016 - January 2019.
<BR>
Parag Mallick (PI), Yolanda Gil (Co-PI).
<BR>
<i>
This project is developing an open-source workflow platform
to enable the generation and effective use of multi-omic workflows.
We are using the WINGS semantic workflow reasoner to significantly automate development
and validation of workflows. A discovery engine will autonomously trigger the execution of
families of workflows and alert users of interesting findings. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
Towards Automating Discovery with DISK: Systematic Data Analysis of Science Repositories.</A>
</font></b><BR>
<a href="http://www.darpa.gov">
Defense Advanced Research Projects Agency (DARPA)</a>.
<BR>
Award number W911NF-15-1-0555.
September 2015 - September 2017.
<BR>
Yolanda Gil (PI), Parag Mallick (Co-PI).
<BR>
<i>
This project is developing a scientific discovery system (DISK) that captures data
analytics expertise, applies it automatically and routinely to scientific data repositories,
and highlights interesting findings and potential discoveries. We are investigating three
fundamental questions: 1) Can we identify domain-independent computational processes and
use them to make autonomous hypothesis-driven discoveries from existing datasets?
2) Can we effectively capture domain-specific knowledge needed to apply those processes
in a new science domain? 3) Can these processes be automatically combined in novel ways
and enable new kinds of discoveries? Our approach is to automate the hypothesize-test-evaluate
discovery cycle with an intelligent system that a scientist can task with lines of inquiry
over existing data repositories. The system will then autonomously test hypotheses by
running analytic workflows on the data and examining the results to report new findings
back to the scientist. DISK extends the existing
WINGS semantic workflow
system. We are initially applying DISK in multi-omics and subsurface water modeling.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
EarthCube Integrated Activities: LinkedEarth: Crowdsourcing Data Curation and Standards
Development in Paleoclimatology.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number ICER-1541029.
September 2015 - August 2018.
<BR>
Julien Emile-Geay (PI), Yolanda Gil (Co-PI), Nicholas McKay (Co-PI).
<BR>
<i>
Paleoclimatology datasets are key to understanding low-frequency, natural climate
variability that significantly modulates anthropogenic global warming. However,
there is currently no universal way to share paleoclimate data between users or
machines, hindering integration and synthesis. The majority of observations are
gathered by independent scientists with no formal language for describing their
data and metadata to each other, or to machines, in a standardized fashion. This is
further aggravated by the diversity of data (e.g. trees, ice cores, lake or marine
sediments, corals, mollusks, speleothems), each having very different characteristics,
and the diversity of measured quantities (e.g. trace metal concentrations, isotope ratios,
layer thickness, etc.). This data diversity is typical in other sciences, particularly
in ecology. Managing and integrating this kind of scientific data is challenging because:
(1) metadata creation and data curation require expert knowledge; (2) top-down data
management approaches do not tend to be effective; (3) existing infrastructure does
not foster standardization. Therefore, there is a critical need for a flexible platform
enabling crowdsourced data curation and standards development through community
participation. In this project we are investigating a socio-technical system that has
the potential to engage a broad user base in geoscientific data curation. Our approach
is based on semantic wikis that incorporate editorial and community-driven processes
that follow principles from social sciences research on successful on-line communities.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
EarthCube Integrated Activities: InGeO: Integrated Geosciences Observatory.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number ICER-1540937.
September 2015 - August 2017.
<BR>
Asti Bhatt (PI), Yolanda Gil (Co-PI), Russell Cosgrove (Co-PI).
<BR>
<i>
The Integrated Geoscience Observatory (InGeO) is a pilot project for geospace research
that facilitates the integration of resources from different disciplines that study
the Sun-Earth system. The observatory creates an integrated package of software
tools contributed by researchers with specific capabilities, and designed to enable
integration of diverse observational data. Features of the toolkit include:
(1) linking diverse datasets from multiple data repositories and automatically
mapping them to a common user-specified coordinate grid; (2) implementing the
well-known Assimilative Mapping of Ionospheric Electrodynamics (AMIE) procedure
for assimilation of this data to yield a global picture; and (3) using the OntoSoft
registry for sharing analytic software and GeoDataspace for credit attribution of
processed data. </i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
SciSpark: Provenance Recording in the Spark Framework for the Regional Climate Model Evaluation System.</A>
</font></b><BR>
<a href="http://www.nasa.gov">
National Aeronautics and Space Administration (NASA)</a>.
<BR>
Grant number 14-AIST-14-0034.
September 2015 - February 2017.
<BR>
Chris Mattmann (PI), Yolanda Gil (Co-PI), Jinwon Kim (Co-PI).
<BR>
<i>
SciSpark is a framework for scaling scientific computations that extends Apache Spark
with new capabilities for processing science data standards for large-scale data,
with a particular focus on climate data. Remote sensing data and climate model
output are multi-dimensional arrays of massive sizes locked away in heterogeneous
file formats (HDF5/4, NetCDF 3/4) and metadata models (HDF-EOS, CF) making it
difficult to perform multi-stage, iterative science processing since each stage
requires writing and reading data to and from disk. Apache Spark implements the
MapReduce paradigm for parallel computing while emphasizing in-memory computation,
and so outperforms the disk-based MapReduce implementation in Apache Hadoop by 100x
in memory and by 10x on disk. SciSpark will enable scalable model evaluation by
executing large-scale comparisons of A-Train satellite observations to model grids
on a cluster of 100 to 1000 compute nodes. This 2nd generation capability for NASA's
Regional Climate Model Evaluation System (RCMES) will compute simple climate metrics
at interactive speeds, and extend to quite sophisticated iterative algorithms such as
machine-learning (ML) based clustering of temperature PDFs, and even graph-based
algorithms for searching for Mesoscale Convective Complexes. The goals of SciSpark are to:
(1) Decrease the time to compute comparison statistics and plots from minutes to seconds;
(2) Allow for interactive exploration of time-series properties over seasons and years;
(3) Decrease the time for satellite data ingestion into RCMES to hours; (4) Allow for
Level-2 comparisons with higher-order statistics or PDFs in minutes to hours; and
(5) Move RCMES into a near real time decision-making platform.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.is-geo.org">
Intelligent Systems for Geosciences.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number IIS-1533930.
September 2014 - August 2016.
<BR>
Yolanda Gil (PI), Suzanne Pierce (Co-PI).
<BR>
<i>
In recent years, intelligent systems have demonstrated significant transformative impact
in the commercial sector. These techniques have been applied to geosciences with some
success, but they are inadequate to meet the challenges presented by geosciences research.
First, using data alone is insufficient to create models of the complex phenomena under
study. Second, geoscientists need to reach across disciplines to synthesize disparate
data and models, which requires extensive qualification and context. Third, scientists
need powerful partnerships with computers in order to explore complex hypotheses and
understand how new findings relate to the existing body of knowledge. Therefore, in order
to tackle complex geosciences phenomena new approaches are needed. The goal of this
project is to produce a report outlining a research agenda for intelligent systems in
geosciences, and their potential to overcome the challenges that geoscientists face.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
EarthCube Research Coordination Networks: iSamplES: the Internet of Samples in the Earth Sciences.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number ICER-1440351.
September 2014 - August 2017.
<BR>
ISI PI: Yolanda Gil.
<BR>
<i>
The Internet of SamplES in the Earth Sciences (iSamplES) is a research coordination
network that seeks to advance the use of innovative cyberinfrastructure to connect
physical samples and sample collections across the Earth Sciences with digital
data infrastructures to revolutionize their utility for science. The ultimate goal
of iSamplES is to dramatically improve the discovery, access, sharing, analysis, and
curation of physical samples and the data generated by their study for the benefit of
science and society. As part of this project, we are developing a registry of sample
repositories, together with metadata to describe their holdings, operations, and curation
procedures.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap/">
Accelerating Map of the World.</A>
</font></b><BR>
<a href="http://www.nga.mil">
National Geospatial-Intelligence Agency (NGA)</a>.
<BR>
January 2015 - September 2015.
<BR>
ISI PI: Yolanda Gil.
<BR>
<i>
Geographic data is increasingly being shared, interchanged and used for purposes other
than the producers' intended purpose. In addition to traditional institutional sources
of information, recent crowdsourcing approaches have proven to be useful additions to
traditional geospatial data sources. Yet the timeliness and currency advantages of
crowdsourcing bring additional burdens regarding quality and fitness-for-use.
Dynamically integrated datasets created through Web mashups are constantly appearing,
resulting in a variety of new geospatial resources that deliver customized information.
Producers expect that their data will have unanticipated uses, while consumers have
limited insight into the integrity of producers and data and whether they can be trusted.
Therefore, information about the quality of available geographic data is vital to the
process of selecting information in that the value of data to a consumer is directly
related to its quality. This project investigated a data quality ontology for
geospatial data, and its mapping to the Content Maturity Model proposed by NGA.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://geosoft.earthcube.org">
EarthCube Building Blocks: Collaborative Open Source Software Sharing for the Geosciences.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number ICER-1440323.
September 2014 - August 2018.
<BR>
Yolanda Gil (PI), Christopher Duffy (co-PI), Chris Mattmann (co-PI), Scott Peckham (co-PI), Erin Robinson (co-PI).
<BR>
<i>
Geosciences software embodies crucial scientific knowledge, and as such it should be explicitly
captured, curated, managed, and disseminated. The goal of this project is to create a system
for software stewardship in geosciences that will empower scientists to manage their software
as valuable scientific assets. Scientific software stewardship requires a combination of
cyberinfrastructure, social infrastructure, and professional development infrastructure.
The framework will result in open, transparent, and broader access to scientific software
for other scientists, software professionals, students, and decision makers. It will
significantly improve the adoption of open data and open software initiatives, improve
reproducibility, and advance scientific scholarship.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.organicdatascience.org">
The Age of Water and Carbon in Hydroecological Systems: A New Paradigm for Science Innovation and Collaboration through Organic Team Science.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number IIS-1344272.
October 2013 - September 2017.
<BR>
Yolanda Gil (PI), Christopher Duffy (co-PI), Paul Hanson (co-PI).
<BR>
<i>
The project will develop a new socio-technical framework
for "organic team science" in which
scientists are motivated to collaborate across diverse scientific communities
and to share and normalize data to solve scientific problems through an open
framework.
This project will develop new scientific work practices and associated
cyberinfrastructure to advance the fields of hydrology and limnology (lake ecology).
The project will advance hydrology by making already-collected
geospatial data more usable for analysis and simulations. It will advance limnology
by developing an integrated hydrodynamic model of lakes as connected to the
broader hydrologic network to quantify water, material, nutrient and energy fluxes,
which is potentially transformative for limnology. It will also advance
socio-technical research in the context of distributed scientific collaboration.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
Learning Big Data Analytic Skills through Scientific Workflows.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number ACI-1355475.
September 2013 - August 2017.
<BR>
Yolanda Gil (PI).
<BR>
<i>
Big data analytics has emerged as a widely desirable skill in many areas. Although
courses are now available on a variety of aspects of big data, there is a lack of a
broad and accessible course that covers the variety of topics that concern big data analytics.
As a result, acquiring practical data analytics skills is out of reach
for many students and professionals, posing severe limitations to our ability as a
society to take advantage of our vast digital data resources. The goal of this
work is to develop curriculum materials for big data analytics to provide broad
and practical training in data analytics in the context of real-world and
science-grade datasets and data analytics methods. A key technical basis of the
approach is the use of workflows that capture expert analytic methods that will
be presented to users for practice with real-world datasets within pre-defined
lesson units. The results of this work include lesson units for learning
expert-level skills in big data analytics, a framework for non-programmers to
understand basic concepts in big data analytics, and a hands-on workflow
framework to learn by direct experimentation and exploration with scientific data.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
Provenance Tracking for Integrated Geospatial Data.</A>
</font></b><BR>
<a href="http://www.ogc.org">
Open Geospatial Consortium (OGC)</a>.
<BR>
September 2013 - August 2014.
<BR>
Yolanda Gil (PI).
<BR>
<i>
Tracking the provenance of geospatial information is important in order to
understand how to trust and use the information based on what sources generated it
and the processes used to integrate it. This project will analyze the use of the
recent W3C PROV standard in the context of geospatial information integration,
particularly to study scalability, granularity, and presentation of provenance.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://geosoft.earthcube.org">
EarthCube Building Blocks: Software Stewardship for the Geosciences.</A>
</font></b><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number ICER-1343800.
September 2013 - February 2016.
<BR>
Yolanda Gil (PI), Christopher Duffy (co-PI), Chris Mattmann (co-PI), Scott Peckham (co-PI), Erin Robinson (co-PI).
<BR>
<i>
Geoscience and environmental science software is crucial for data analysis to
generate new knowledge and understanding about the Earth. Because reproducibility
of operations, calculations, and predictions done with this software is important
for science, commercial, and regulatory applications, it is important that the
software generated by geoscientists and their colleagues be captured, curated,
managed, and shared. The GeoSoft project brings together computer scientists,
geoscientists, and social scientists to help scientists describe basic
characteristics of their code and share it. GeoSoft will be a social site
where scientists can discover alternative approaches to release free software,
use intelligent interfaces to explain how their software works, and form
productive communities around software projects. This research has the
potential to fundamentally transform geosciences by making scientific software
readily available to researchers and citizen scientists for efficient data
analysis.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
A Scalable Open Source Platform for Data Processing, Archiving and Dissemination.</A>
</font></b><BR>
<a href="http://www.darpa.mil">
Defense Advanced Research Projects Agency (DARPA)</a>.
<BR>
Grant number FA8750-13-C-0016, subcontract to MDA.
June 2011 - March 2015.
<BR>
Yolanda Gil (co-PI), Chris Mattmann (co-PI), Sam Park (PI).
<BR>
<i>
End users have a lot of data, but do not have the expertise needed to analyze it.
The goal of this project is to empower end users to analyze big data by demonstrating
that: 1) data analytics experts can use open source software to quickly assemble
workflows, 2) end users can easily run these expert-grade workflows and get useful
views on their data. Our work combines semantic workflow capabilities of the WINGS
workflow system with scalable data systems and workflow execution infrastructure
available in the OODT framework. Our work includes a release of the integrated
system as open source software within the Apache OODT project.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
Workflows for the Regional Climate Model Evaluation System (RCMES).</A>
</b></font><BR>
<a href="http://www.nasa.gov">
National Aeronautics and Space Administration (NASA)</a>.
<BR>
Subcontract to JPL.
January 2013 - June 2013.
<BR>
ISI PI: Yolanda Gil. PI: Chris Mattmann.
<BR>
<i>
The Regional Climate Model Evaluation System (RCMES) enables climate scientists to compare and evaluate regional climate data. The system supports easy access to shared databases such as NASA data sources, and enables scientists to compare climate model predictions with the observations in those databases. The goal of this project is to extend RCMES to include workflows for climate model comparison and evaluation, tracking provenance and enabling reproducibility.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
An Analytical Framework for Provenance-Rich Social Knowledge Collection.</A>
</b></font><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number IIS-1117281.
September 2011 - August 2014.
<BR>
Yolanda Gil (PI).
<BR>
<i>
This project will investigate a new generation of provenance-rich social knowledge collection
systems that will greatly improve the ability of people to create online communities of interest
and share information. The research will transform the state of the art in social content
collection in several important ways. First, social knowledge collection systems will be
augmented to help contributors structure factual content, so that information can be
aggregated to answer reasonably interesting albeit simple factual queries. We will build on
a semantic wiki framework to allow users to create structured factual content as
object-property-value triples. We will not assume pre-defined ontologies, but rather
develop algorithms that analyze current content and suggest opportunities for structuring
contributions so they can be aggregated to answer simple queries. Second, the systems will include
detailed provenance records that reflect how the content was created, allowing contributors
to enter alternative viewpoints and enabling consumers to make quality and trust judgments.
The research will include developing algorithms to derive trust metrics from the provenance
records and to allow users to define views on the content based on provenance criteria.
It will create novel approaches to propagate trust across content topics and categories
and complement existing algorithms that propagate trust in social networks. Third, the
systems will proactively guide contributors to invest effort where it is most needed,
using novel algorithms to detect knowledge gaps and allowing users to define
queries that will be used to drive further contributions.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="https://sites.google.com/site/earthcubeworkflow/">
An EarthCube Community Group and Roadmap for Workflows for Geosciences.</A>
</b></font><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number EAR-1238216.
April 2012 - March 2013.
<BR>
Yolanda Gil (PI), Aaron Braekel (co-PI), Ewa Deelman (co-PI), Ibrahim Demir (co-PI),
Christopher J. Duffy (co-PI), Suresh Marru (co-PI), Marlon Pierce (co-PI).
<BR>
<i>
The goal of this project is to elicit requirements for workflows in geosciences, assess the state of the art and current practices, identify gaps in both the use and the capabilities of current workflow systems in the earth sciences through use case studies, and identify grand challenges for the next decade along with the possible paths to addressing those challenges. This effort is part of the NSF EarthCube initiative.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
Discovery Informatics.</A>
</b></font><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number IIS-1151951.
September 2011 - August 2012.
<BR>
Yolanda Gil (PI).
<BR>
<i>
To pursue the ambitious research agendas put forward by many science disciplines,
many challenges must be addressed in the areas of information sciences, intelligent systems,
and human-computer interaction. Data modeling and integration still require large investments
of scientist time and effort. The scientific literature grows so quickly in many areas that
it becomes unmanageable for scientists. Many aspects of the scientific discovery process are
often largely manual and could be automated, improved, or made more efficient. Better
interfaces for collaboration, visualization, and understanding would significantly improve
scientific practice. The goal of this project is to produce a report outlining the
opportunities that scientific discoveries present to information sciences and intelligent
systems as a new area of research called discovery informatics.</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
Workflow-Net: Cybersecurity through Nimble Task Allocation:
Workflow Reasoning for Mission-Centered Network Models.</A>
</b></font><BR>
<a href="http://afosr.sciencewise.com">
Air Force Office of Scientific Research (AFOSR)</a>.
<BR>
Grant number FA9550-11-1-0104.
June 2011 - March 2015.
<BR>
Yolanda Gil (PI).
<BR>
<i>
Traditional cybersecurity has focused on techniques to analyze and eliminate vulnerabilities
in a network, often in response to actual security breaches of previously unknown weaknesses.
Recognizing that in practice network operations can never be fully secure, a major focus of
recent research is on intrusions that are assumed to be ongoing in the network by one or
more malicious parties. In this new view on cybersecurity, a key desired capability is to
be able to accomplish a mission even while the network is compromised and subject to
deception. However, traditional network models lack a representation of the mission and
of how network resources are utilized to accomplish various aspects of the mission.
In this project, we will investigate a new approach to develop a general framework for
representing models of mission goals and tasks, and to exploit those models to make a
mission more robust to deception operations co-occurring in the network. These
mission-centered network models (MCNMs) will build on and extend current two-layered
(logical/physical) network models by integrating a new layer of task-level representations
of the mission into those models. In this new task-oriented layer, a mission can be
characterized as a set of goals, each accomplished by a set of interdependent tasks
that place requirements on the network resources. The system can then dynamically
control the mappings of those tasks onto network resources using a variety of algorithms
that take into account which resources are currently compromised. As a result, a mission
can be protected from ongoing intrusion and deception activities by dynamically reallocating
resources as they become compromised and by examining provenance records of task outcomes to
determine their reliance on compromised resources. MCNMs can be used to determine which
resources are critical for any given mission, to prioritize the use of uncompromised resources,
to accomplish mission tasks and estimate trust in them when resources are compromised, and to
determine the practical impact on the mission of deception activities. MCNMs will enable a
new approach to cybersecurity in network-based operations.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://workflow-sharing.isi.edu">
W-SHARING:
Towards Shared Repositories of Computational Workflows</A>
</b></font><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number IIS-0948429.
September 2009 - August 2011.
<BR>
Yolanda Gil (PI).
<BR>
<i>
Scientific computing has entered a new era of scale and sharing with
the arrival of cyberinfrastructure for computational experimentation.
A key emerging concept is scientific workflows, which provide a
declarative representation of scientific applications as complex
compositions of software components and the dataflow among them.
Workflow systems manage their execution in distributed resources,
track provenance of analysis products, and enable rapid
reproducibility of results. In current cyberinfrastructure, there are
well understood mechanisms for sharing data, instruments, and
computing resources. This is not the case for sharing workflows,
though there is an emerging movement for sharing analysis processes in
the scientific community. In this grant, we are investigating
computational mechanisms for sharing workflows as a key missing
element of cyberinfrastructure for scientific research. We are
exploring three major research topics. First, we are eliciting new
requirements that workflow sharing poses over current techniques to
share software tools and libraries. Second, we want to understand how
shared workflow catalogs should be designed. Existing shared data
catalogs are a successful model, but software artifacts require
different representations and access functions. Finally, we are
studying what sharing paradigms might be appropriate for scientific
communities, exploring environments ranging from traditional
server-based architectures to wikis to Web 2.0 social sites.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/division3/discourse">
PedWorkflow: Workflows for Assessing Student Learning</A>
</b></font><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number IIS-0917328.
September 2009 - August 2011.
<BR>
Jihie Kim (PI), Gisele Ragusa (co-PI), Erin Shaw (co-PI), and Yolanda Gil (co-PI).
<BR>
<i>
As on-line learning becomes more popular and is increasingly
integrated in engineering courses, instructors
become overwhelmed with the amount of information
that they have to process.
For example, discussion boards support collaborative interaction and
reflective problem solving, but instructors need to monitor the student
discussions in order to address questions and make corrections, as well
as to grade student participation.
The goal of this project is to create a novel workflow environment
to support efficient assessment of student learning through the design
and composition of assessment workflows. The workflows will support
data analysis and will be re-usable across curricula and instructors.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="https://knowledgecaptureanddiscovery.github.io/yolanda_gil_website">
Designing Scientific Software One Workflow at a Time
</A>
</b></font><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number CCF-0725332.
October 2007 - September 2011.
<BR>
Ewa Deelman (PI) and Yolanda Gil (co-PI).
<BR>
<i>
Much of science today relies on software to make new discoveries.
This software embodies scientific analyses that are frequently
composed of several application components and created collaboratively
by different researchers. Computational workflows have recently emerged as
a paradigm to manage these large-scale and large-scope scientific analyses.
Workflows represent computations that are often executed in geographically
distributed settings, their interdependencies, their requirements and their
data products. The design of these workflows is at the core of today's
scientific discovery processes and must be treated as scientific products
in their own right. The focus of this research is to develop the foundations
for a science of design of scientific processes embodied in the new artifact
that is the computational workflow. The work will integrate best practices
and lessons learned in existing workflow applications, and extend them in
order to define and formalize design principles of computational workflows.
This work will result in a fundamentally new approach to designing workflows
that will greatly improve the scientific software design methodology by
defining and formalizing design principles, and by familiarizing the
scientific community with these effective workflow design processes.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="https://knowledgecaptureanddiscovery.github.io/yolanda_gil_website">
Plato: Phased-Learning through Analyzing Teaching and Observation
</A>
</b></font><BR>
<a href="http://www.darpa.mil">
DARPA Bootstrapped Learning (BL) program</a>.
<BR>
Grant number HR0011-07-C-0060, subcontract to SRI International.
August 2007 - July 2011.
<BR>
ISI co-PIs: Paul Cohen and Yolanda Gil.
<BR>
<i>
The goal of this project is to develop an electronic student
that can learn from a teacher using different methods of natural instruction.
We will contribute strategies for learning a broad range of generalities
about process knowledge from being told by the teacher. These general
descriptions will be tested by the learner with examples
and practice of those processes. We will use Interdependency Models to relate the individual
teacher instructions, check the consistency with the student's prior knowledge,
and detect gaps in the stated instruction that could be filled through practice.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
Windward: Scalable Knowledge Discovery Through Grid Workflows</A>
</b></font><BR>
<a href="http://www.afrl.af.mil">
Air Force Research Laboratory (AFRL)</a>.
<BR>
Grant number FA8750-06-C-0210.
September 2006 - December 2008.
<BR>
Yolanda Gil (PI). ISI co-PIs: Paul Cohen and Ewa Deelman.
<BR>
<i>
Distributed workflows are emerging as a key technology to conduct large-scale
and large-scope scientific applications in earthquake science, physics,
astronomy, and many other sciences. In this new project, we will investigate
the use of workflow technologies for Artificial Intelligence applications with
a particular focus on data analysis and knowledge discovery tasks. Based on
the data to be analyzed, an initial workflow template is formed by selecting
from a library of known-to-work compositions of general-purpose machine
learning algorithms. The workflow template is specialized through knowledge-based
selection and configuration of algorithms. Finally, the workflow is mapped to
available resources and restructured to improve execution time. Data analysis
and knowledge discovery applications will benefit from the automation, scale,
and distributed data and resource integration supported by distributed workflow
systems. We will also conduct new research in important aspects of workflow
systems. To what extent can we represent complex algorithms and their subtle
differences so that they can be automatically selected and configured to
satisfy the stated application requirements? Can we develop learning
techniques that improve the performance of the workflow system by exploiting
an episodic memory of prior workflow executions? What mechanisms will be
needed to support autonomous and robust execution of concurrent workflows
over continuously changing data?
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/nsf-workflows06">
Challenges of Scientific Workflows
</A>
</b></font><BR>
<a href="http://www.nsf.gov">
National Science Foundation (NSF)</a>.
<BR>
Grant number IIS-0629361.
May 2006 - October 2007.
<BR>
Yolanda Gil (PI) and Ewa Deelman (co-PI).
<BR>
<i>
In recent years, workflows have emerged as a paradigm for conducting large-scale
scientific analyses. The structure of a workflow specifies what analysis
routines need to be executed, the data flow amongst them, and relevant
execution details. Workflows provide a systematic way to capture scientific
methodology and provide provenance information for their results. Robust and
flexible workflow creation, mapping, and execution are largely open research
problems. The aim of this project is to bring
together IT researchers and practitioners working on a variety of aspects
of workflow management as well as domain scientists who use workflows for
day-to-day data analysis and simulation. The project will produce
a final report with recommendations to the community regarding the
challenges of scientific workflows and their role in cyberinfrastructure
planning for 21st century science and engineering research and education.
</i>
<LI>
<font color="#ED181E"><b>
<A HREF="http://www.isi.edu/ikcap">
C4ML: Metareasoning for Integrated Learning</A>
</b></font><BR>
<a href="http://www.darpa.mil">
DARPA Integrated Learning (IL) program</a>.
<BR>
Grant number FA8650-06-C-7606, subcontract to BBN Technologies.
May 2006 - July 2008.
<BR>
ISI co-PIs: Paul Cohen and Yolanda Gil.
<BR>
<i>
In this project we will develop a learning metareasoner to coordinate
the activities of many learners in an integrated system that learns
procedural knowledge from user demonstrations and past knowledge.
A learning metareasoner is a problem solver that has explicit representations
of its current learning state and learning goals, and has metareasoning methods to
accomplish those goals. The learning metareasoner will assess its progress
based on four criteria: capability, confidence, coverage, and competence (C4).
</i>