4.33.0 (2024-09-04)
- auth: add user api keys table (#4473) (7c1334d)
- incorporate schema for postgresql and add integration test (#4474) (cd64a99)
- onboarding: add bedrock (#4465) (b03901b)
- render db schema in welcome message when applicable (#4479) (ecdf039)
- Return Query from DeleteSystemApiKey mutation (#4432) (b0639e0)
- create nbconvert makefile (#4456) (f32f842)
- fix spacing (f959dc6)
- improve migration guide (#4425) (aed6c83)
- Update hosted Phoenix tutorials to use register and add quickstarts (#4371) (14b898b)
- update readme with a gif (15a8d0a)
- vision tracing tutorial (#4441) (669c1b7)
4.32.0 (2024-08-29)
- auth: delete system keys (#4426) (a6fa21e)
- auth: wire up login/logout (#4419) (5f60258)
- tools: Display tool schema definitions in the UI (#4428) (6aa787e)
- ui: add the ability to turn off auto-refresh of projects (#4414) (4a792d2)
- error message for PHOENIX_PORT env vars auto-generated by kubernetes (#4422) (63d0adb)
- scaffolder should incorporate port from command line (#4415) (0678c86)
- simplify readme (76db494)
4.31.0 (2024-08-28)
- ui: add the ability to turn off auto-refresh of projects (#4414) (4a792d2)
- vision: show images in a gallery, expandable images (#4407) (9e2d67f)
- annotation events should refresh trace project (#4412) (3a18c13)
- experiments: ensure compare experiments page does not break for experiments that contain a large number of examples (#4402) (71484e0)
- use dataloader for experiment run annotations (#4397) (5582ce6)
4.31.0 (2024-08-27)
4.30.2 (2024-08-27)
- experiments: ensure compare experiments page does not break for experiments that contain a large number of examples (#4402) (71484e0)
- use dataloader for experiment run annotations (#4397) (5582ce6)
4.30.1 (2024-08-26)
4.30.0 (2024-08-26)
4.29.0 (2024-08-26)
- Add Experiments API to OpenAPI Schema (#4356) (ca4fb5d)
- Delete system API key mutation (#4337) (b6bb6bc)
4.28.1 (2024-08-23)
4.28.0 (2024-08-23)
4.27.0 (2024-08-22)
- Add list_experiments client method (#4271) (a063d83)
- Add fixtures only for new DBs, add flag to force fixture ingestion (#4315) (ef4adcd)
- auth: minimal login page (#4320) (764f359)
- experiments: add the ability to copy experiment IDs to the clipboard (#4317) (589ac03)
- onboarding demo projects (#4262) (74dd3c7)
4.26.0 (2024-08-21)
4.25.0 (2024-08-21)
- auth: add expiry support for system keys (#4296) (8d436a5)
- auth: create system key (#4235) (bae5fbe)
- auth: system api keys ui (#4270) (695fdea)
- Clarify register API documentation (#4280) (819236c)
- Create phoenix.otel package (#4230) (4e2ad61)
- conditionally display re-ranker queries in span details (#4263) (248d61b)
- python: application launch on Windows (#4276) (9ede0a3)
- add haystack to README (a244cdd)
- Add human feedback notebook tutorial (#4257) (d4c200f)
- add LLM fixtures for demo dataset (fine-tuning dataset), fix demo notebook (#4286) (9f54510)
- Add Phoenix Llamaindex RAG Demo notebook + chunks + questions (#4202) (7f1b817)
- fix variable name typo in run experiments doc (#4249) (9745754)
4.24.0 (2024-08-15)
- auth: add user role, exclude system in user lists (#4229) (fb18ab6)
- auth: user / system api key resolvers (#4232) (c7b939e)
- experiments: ability to specify concurrency in run_experiment and evaluate_experiment (#4189) (8239d3a)
4.23.0 (2024-08-13)
- auth: plumb auth_enabled flag, add a settings page (#4213) (7f66f0b)
- auth: user gql query (#4219) (46543be)
- auth: users table in settings (#4221) (803399f)
- Propagate span annotation metadata to examples on all mutations (#4195) (181e021)
- UI: show IO if embedding span is missing embeddings (#4218) (5bc97ff)
4.22.1 (2024-08-12)
- experiments: evaluate_experiment on existing experiment runs (#4204) (515e195)
- remove skep_deps_check param on phoenix.instrumentors (#4205) (7a9ad5e)
4.22.0 (2024-08-09)
4.21.0 (2024-08-08)
4.20.2 (2024-08-07)
4.20.1 (2024-08-06)
- cache invalidation (#4138) (d75dc8a)
- ensure span annotations appear in sorted order by name (#4144) (ff2e4b9)
4.20.0 (2024-08-06)
- use dataloader for span annotations (#4006) (ab53325)
- ensure rest api urls include custom root path (#4137) (9550a7e)
- add more videos to docs (GITBOOK-787) (cb9ee71)
- Added Prompt flow documentation with example (GITBOOK-781) (2772397)
- minor fixes to the quickstart (GITBOOK-786) (04a8ea0)
- Update Tracing Integrations to match standard format (GITBOOK-784) (dedf969)
4.19.0 (2024-08-02)
4.18.0 (2024-08-02)
- Add annotation summaries to projects (#4108) (5aa79c4)
- annotations: add ability to edit human span annotations (#4111) (67cb9a2)
- session: support a slug to the session.view (#4114) (9305f8a)
4.17.0 (2024-08-02)
- annotations: add feedback column to spans / traces tables with all annotations (#4100) (193b309)
- annotations: update RetrievalEvaluationLabel styles to match AnnotationLabel (#4101) (eef32df)
- ui: condensed trace tree (#4099) (548f685)
4.16.1 (2024-07-31)
4.16.0 (2024-07-30)
- Add sort order argument to Span- and Trace- Annotation fields (#4079) (cf3b37c)
- add trace stream toggle into preferences context (#4035) (bc3be3e)
- allow retries for annotation insertions when the corresponding span/trace does not exist (#4026) (13af3b5)
- annotations: add feedback tab to span details (#4069) (8dc9672)
- annotations: default collapse annotation explanations (#4081) (dbf3ee4)
- annotations: make color for evaluation summaries consistent with table (#4082) (70a8b5a)
- annotations: migrate span eval labels to use AnnotationLabel (#4068) (6219e91)
- trace: add a span aside with timing info and feedback (#4071) (275ad73)
- ui: tracing getting started button (#4067) (9eba5eb)
4.15.0 (2024-07-26)
- Add containedInDataset boolean field to gql Spans (#4015) (3c096ca)
- annotations: add annotation macro and filter condition snippet to project page (#4024) (acc2ff1)
- annotations: refetch annotations on annotation changes (#3980) (9ba7cb9)
- datasets: add dataset edit UI and dataset metadata on create (#4005) (d80c438)
- trace: UI lazy loading of spans (#4014) (ab4fafa)
- Version mismatch checks (#3989) (8454183)
- add mutex for sqlite (#3981) (91f96ef)
- Changes dataset name query from is to equal (#3983) (3f77759)
- move all fixtures from jsonl to parquet (#3943) (0587462)
- remove invalid command from dev:ui script (#3982) (02f264c)
4.14.1 (2024-07-23)
4.14.0 (2024-07-23)
- annotations: annotations UI (#3914) (cd1a48f)
- Extend evals DSL to accept 'annotations' symbol (#3939) (659b674)
- Fix error in LlamaIndex Quickstart (GITBOOK-750) (40c5b28)
- Fix images in Custom Task Evaluation (GITBOOK-749) (ee7365e)
4.13.1 (2024-07-22)
4.13.0 (2024-07-22)
- Add API docstrings for experiment evaluators module (#3944) (53079ce)
- api ref sidebar overhaul (0614255)
- api ref updates and docstring fixes (e089f99)
- small fixes for datasets and experiments quickstart notebook (#3934) (e24d721)
- Update README.md (7836779)
4.12.0 (2024-07-18)
4.11.0 (2024-07-18)
- Add Guardrail span kind type (#3919) (c0180ef)
- annotations: gql resolver for annotations on a span (#3915) (c058bbf)
- flatten sequence attribute when value is ndarray (which is not Sequence) (#3926) (a361f87)
- initialize tracer provider for internal server instrumentation (#3921) (c59af75)
- security fix for braces (#3924) (c2595c6)
4.10.1 (2024-07-16)
4.10.0 (2024-07-16)
- Add GQL mutations for Span + Trace Annotations (#3891) (78e7e3b)
- Add REST routes for span and trace annotations (#3869) (43eede1)
- annotations: ability to copy span and trace IDs (49085c4)
4.9.0 (2024-07-10)
- graphql: clear project when end_time is UNSET (#3879) (7c77a73)
- remove phoenix.datasets imports (12adc6a)
- api reference overhaul modules (e3b9c7f)
4.8.1 (2024-07-09)
4.8.0 (2024-07-08)
4.7.2 (2024-07-08)
- experiments: do client.post in thread (#3846) (8db5bdc)
- make projects page scrollable (#3756) (56f1374)
4.7.1 (2024-07-04)
4.7.0 (2024-07-03)
4.6.3 (2024-07-03)
4.6.2 (2024-07-02)
4.6.1 (2024-07-02)
- txt2sql (4322b7d)
4.6.0 (2024-07-02)
- create_evaluator decorators (#3642) (56acddd)
- ability to clear data older than X date, fix DB constraint errors for span.id from datasets to projects (#3670) (993ad5d)
- add annotations resolver on DatasetRun type (#3473) (c677091)
- Add basic evaluators for string experiment outputs (#3534) (85bec41)
- add dataset-related tables (#3169) (b164dfe)
- add experiment-related tables and migrations (#3381) (b08e8d4)
- add experiments resolver to DatasetExample gql type (#3446) (f526025)
- add graphql resolver for adding spans to datasets (#3205) (b80979e)
- Add LLM evaluators (#3571) (032672b)
- add patchDatasetExamples mutation (#3343) (9ffe198)
- add project resolver on span (#3406) (b64d78b)
- Add relevance evaluator (#3604) (da4a6b3)
- add runs resolver on Experiment type (#3465) (8140957)
- add span resolver on DatasetExample gql type (#3394) (6c46d50)
- auth: ability to set headers via environment variables (ff5b64d)
- compareExperiments resolver (#3481) (2becd18)
- dataset example slideover (#3325) (c64f99b)
- dataset: gql dataset versions connection (#3222) (de28b12)
- datasets: add reference as alias of expected for evaluator argument bindings (#3790) (fdd070a)
- datasets: add client method for appending to datasets (#3659) (9c444a8)
- datasets: add dataframe transformation to dataset (#3736) (fb5730a)
- datasets: add example modal (#3424) (e52867c)
- datasets: add graphql field from trace to project (#3606) (7a54241)
- datasets: add jsonl to download menu (#3495) (fcd6c27)
- datasets: add pagination to dataset examples table (#3299) (33d7a74)
- datasets: add sequence number for experiments of the same dataset (#3486) (1a692cf)
- datasets: add span to dataset from the trace page (#3230) (945af8c)
- datasets: add the ability to create a dataset dynamically (#3712) (81c0cae)
- datasets: allow unrecognized parameters in the evaluator function with default values (#3674) (8b97a5e)
- datasets: capture traces from experiments and their evaluations (#3579) (1917cd7)
- datasets: create dataset UI (#3217) (5183620)
- datasets: dataset upload endpoint (plus fixtures) (#3183) (626f18d)
- datasets: datasets graphql (#3192) (1697d96)
- datasets: datasets page (#3172) (89305fe)
- datasets: Delete dataset mutation (#3321) (053fa31)
- datasets: Delete dataset UI (#3336) (202e9f8)
- datasets: Delete examples (#3352) (42ab894)
- datasets: delete examples mutation (#3324) (febea33)
- datasets: deny v1 routes and gql mutations if readonly (#3501) (de376cf)
- datasets: Display latest version (#3373) (66cd6a8)
- datasets: download csv button (#3312) (e5b83a2)
- datasets: download dataset as CSV text file (#3250) (9629d39)
- datasets: download jsonl for openai (#3493) (e4412ef)
- datasets: example and experiment count on datasets table (#3447) (2e3413a)
- datasets: example experiment runs (#3476) (db592a8)
- datasets: expose the API playgrounds (#3204) (da1416b)
- datasets: get_dataset_by_name (726d97d)
- datasets: gql dataset create (#3203) (679a868)
- datasets: gql for adding examples (#3266) (4049228)
- datasets: gql resolver for dataset example count (#3437) (862bb1f)
- datasets: gql resolver for experiment count (#3443) (5b6bc5c)
- datasets: gql resolver returns examples in descending order (#3448) (624ba10)
- datasets: JSON endpoint to get dataset versions (#3323) (fec38ff)
- datasets: link to view source span (#3413) (faa925e)
- datasets: multi-select on span / traces tables (#3236) (160c4e6)
- datasets: navigate to examples if no experiments exist (cbbed30)
- datasets: post the result of each experiment/evaluation run immediately when it finishes (#3666) (4e21d2c)
- datasets: print experiment summaries (#3709) (7c70afa)
- datasets: print the URL to the dataset when uploaded (#3647) (76439cf)
- datasets: python instructions (#3569) (ee0788a)
- datasets: routing for examples and experiment pages (#3470) (141b90c)
- datasets: show example details in a slide-over (b1a1317)
- datasets: sort by name and createdAt (79f8c88)
- datasets: sort on version (#3370) (41348cf)
- datasets: spans as examples (#3279) (1d46c42)
- datasets: synchronously upload dataset examples returning dataset_id in JSON (#3347) (c32ac4d)
- datasets: UI to edit a dataset example (#3376) (3950256)
- datasets: upload JSON for dataset examples (#3658) (47ef311)
- datasets: usability enhancements (#3773) (912dc9b)
- datasets: version history modal (#3444) (86755a4)
- display average run latency in the experiments table (#3743) (cfaafd5)
- error rate resolver on Experiment type (#3588) (ceaea16)
- Experiments improvements (#3638) (bd85bea)
- experiments: add experiment name (#3512) (801ac29)
- experiments: add the ability to view an experiment's traces (#3603) (084a0c6)
- experiments: comparison details slideover (74d1bd0)
- experiments: delete experiments ui (623805c)
- experiments: delete experiments ui (b942b59)
- experiments: detail view for comparison (ebc4aa1)
- experiments: evaluator icon and ingestion (#3639) (70ba085)
- experiments: evaluator trace slide-over (#3680) (2df5b9d)
- experiments: experiment error rate column (#3657) (41d354f)
- experiments: experiment evaluation summaries in the table (#3575) (85c457a)
- experiments: experiments compare table (47af587)
- experiments: experiments table (#3454) (a9981da)
- experiments: full-text toggle for experiments table (537ed97)
- experiments: gql resolver for experiments (#3404) (6d70786)
- experiments: Implement run_experiment (#3471) (87a0501)
- experiments: navigation to experiments view (#3509) (a293f7e)
- experiments: run count resolver on experiments (#3679) (2444f42)
- experiments: show run count (#3690) (2c79a78)
- experiments: show trace slide-over on experiment page (#3640) (8457cb5)
- experiments: ability to view evaluator traces (811290e)
- experiments: add the ability to view experiment metadata in full (#3686) (3560e1d)
- experiments: minimum viable dialog showing how to run an experiment (#3704) (4fb13b8)
- experiments: Switch UI to use experiment name (#3523) (a953231)
- gql resolver for dataset examples (#3238) (fa0b4d2)
- Implement GET /datasets/id and GET /datasets (#3197) (36abede)
- Implement experiments REST API (#3411) (d369fb3)
- implement get_dataset method on phoenix.Client (#3490) (09fb3f0)
- implement initial experiment evals (#3526) (b6fabdf)
- implement patchDataset mutation (#3457) (a0240b3)
- Improve task argument binding and document run_experiment (#3789) (0b64cbe)
- List Dataset Examples (#3271) (d5f4391)
- resolvers for experiment annotation aggregations (#3549) (227e6e0)
- Support repetitions for experiment runs (#3532) (7942694)
- ui: display examples in dataset page (#3277) (829746a)
- Unify run_experiment and evaluate_experiment (#3585) (7e1ffb6)
- add tiebreak to versions resolver (#3488) (ac23ec7)
- Address relevance eval feedback (#3609) (b231169)
- datasets: allow duplicate keys for csv upload (#3464) (a0a5b25)
- datasets: api spec for upload endpoint (#3213) (b719267)
- datasets: bug with json upload (#3663) (d667b8f)
- datasets: colab usage of dataset.examples should no longer be list (#3781) (4f148ae)
- datasets: filter examples by dataset in gql (#3330) (e5606e7)
- datasets: free up the output keyword as attribute of experiment run objects (#3793) (6b4db71)
- datasets: get metadata as {} when its value is None in JSON (#3555) (6249ebe)
- datasets: json return payload for upload csv endpoint (#3364) (4a1d063)
- datasets: make tests pass with new client (5cfdc5b)
- datasets: missing annotation trace id (#3664) (d800e36)
- datasets: reconcile Dataset methods (#3508) (43db5bc)
- datasets: select nested rows on traces (#3489) (0bdb860)
- datasets: show full bar on evals of all 1s (#3733) (3faa051)
- datasets: squash experiment run output by "result" key for graphql query (#3672) (20dba43)
- datasets: typo on dict type for typed dict (#3684) (5e8e9a3)
- datasets: update span kind for evaluator with semantic conventions v0.1.9 (#3667) (ff2de45)
- ensure patches are sorted in numeric patch order (#3379) (70facf1)
- experiments: Improve the performance of the table (#3732) (8e33b77)
- experiments: fix colab links (#3637) (841ac0d)
- fix annotation trace ts errors (8314aa5)
- json cell for experiment metadata (#3556) (f9e2b6d)
- openapi import error (#3619) (1f81c05)
- openapi yaml parsing for containers (#3788) (959abf7)
- order runs in descending order in runs resolver on Experiment type (#3480) (e1818b7)
- resolve sqlalchemy warning regarding remote (#3522) (cd15d9b)
- style and type errors (#3540) (2cba662)
- switch to upload_dataset for examples (#3783) (bea7c2f)
- ui: right align numeric columns (#3587) (781ae7a)
- Added more detail prepping and exporting eval data to the Bring Your Own Evaluator section (GITBOOK-704) (96a312b)
- api-ref: fix readthedocs build issues (#3706) (0827726)
- Cleanup datasets section (GITBOOK-694) (18a4d5b)
- Datasets documentation (GITBOOK-697) (8148f67)
- Datasets review - fixing typos, syntax, labels, links (GITBOOK-702) (fcb56ee)
- datasets tutorials and quickstart (#3734) (cfa641c)
- datasets: print useful URLs, disable repetitions (#3583) (14c7d9f)
- experiments: prompt template iteration for summarization task (#3669) (0842df4)
- experiments: txt2sql (#3626) (33cd194)
- experiments: txt2sql (#3714) (b083159)
- fix creating datasets (GITBOOK-701) (9b83b1d)
- fix typos (GITBOOK-698) (d413e54)
- GPT-4o first set (GITBOOK-695) (8dff0bf)
- No subject (GITBOOK-696) (88859e1)
- No subject (GITBOOK-699) (9beed78)
- No subject (GITBOOK-700) (5ac466c)
- No subject (GITBOOK-703) (f04e9c5)
- No subject (GITBOOK-707) (2237a88)
- notebook: datasets and experiments quickstart (#3703) (991df49)
- placeholders for experiments (GITBOOK-705) (1f7d183)
- readthedocs (71fceab)
- rest api guidance (#3314) (0309017)
- small fixes (GITBOOK-706) (297458e)
- small fixes (GITBOOK-708) (4990aa5)
- sphinx api-ref for readthedocs (0bcccbd)
- update dataset creation (GITBOOK-711) (51c5ea1)
- use kwargs with datasets (#3748) (530b2c6)
- use kwargs with datasets (#3748) (#3749) (599e340)
4.5.0 (2024-06-21)
4.4.3 (2024-06-17)
4.4.2 (2024-06-13)
4.4.1 (2024-06-11)
4.4.0 (2024-06-10)
4.3.1 (2024-06-10)
4.3.0 (2024-06-07)
- Adds timing info to llm_classify (#3377) (3e2785f)
- Serializable execution details (#3358) (fc74513)
- ui: display input and output for tool spans (if available) (#3396) (73312dc)
- add separate package installations to notebooks (#3393) (914e3fe)
- filter out undefined (#3383) (e3a2d31)
- percentage sign for alembic configparser (#3403) (87bcd59)
4.2.4 (2024-05-28)
4.2.3 (2024-05-23)
4.2.2 (2024-05-23)
4.2.1 (2024-05-23)
4.2.0 (2024-05-23)
4.1.3 (2024-05-22)
4.1.2 (2024-05-20)
4.1.1 (2024-05-17)
4.1.0 (2024-05-17)
- bump base image in kustomize (#3193) (5e8bc3d)
- PHOENIX_WORKING_DIR default value documentation (#3190) (6957bd9)
4.0.3 (2024-05-13)
4.0.2 (2024-05-11)
- Bulk inserter begins first insert immediately (#3151) (7e17cb2)
- unflatten attributes when loading spans from trace_dataset (#3170) (a165023)
4.0.1 (2024-05-09)
4.0.0 (2024-05-09)
⚠ BREAKING CHANGES
- Remove experimental module (#2945)
- Add log_traces method that sends TraceDataset traces to Phoenix (#2897) (c8f9ed2)
- add a last N time range selector on project / projects pages (#2907) (3c115f8)
- add bedrock claude tracing tutorial (#2919) (b8b5240)
- add default limit to /v1/spans and corresponding client methods (#3026) (e5698d7)
- add gradient start/end to projects table (#2956) (5b6b217)
- add grpc endpoint (#2232) (8bbd136)
- Add indexes on Annotation tables (#3082) (682ecee)
- Add indexes on spans table (#3098) (12d2574)
- add opentelemetry trace instrumentation for Phoenix server (#2990) (6ed494e)
- Add SQL and Code Functionality Eval Templates (#2861) (c7d776a)
- add trace and document evals to GET v1/evaluations (#2910) (79229f2)
- Add user frustration eval (#2928) (406938b)
- Added support for default_headers for azure_openai. (#2917) (6ee5f24)
- convert graphql api to pull trace evaluations from db (#2867) (11aa455)
- Deprecate datasets module, rename to inferences (#2785) (4987ea3)
- experimental: postgres support (a2657d4)
- fetch annotation names (#2964) (6c5d25d)
- fetch document retrieval metrics per span using SQL (#2960) (9fdb765)
- graphql api pulls from db for document evaluations (#2865) (e4b667d)
- grpc interceptor for prometheus (#3056) (610c8fa)
- ingest document evals (#2847) (f3fde50)
- ingest pyarrow span evals into sqlite (#2837) (3a6666c)
- ingest trace annotations (#2852) (792f674)
- make graphql api for span evaluations read from database (#2860) (5adf750)
- move document evaluation summary to pull from db (#2888) (73ca2d7)
- openapi ui for api exploration (#3041) (5b22961)
- persistence: add support for sorting by eval scores and labels (#2977) (44c3068)
- persistence: bulk inserter for spans (#2808) (9ce841e)
- persistence: clear project (#2976) (665c166)
- persistence: clear traces UI (#2988) (a717ff6)
- persistence: dataloader for document retrieval metrics (#2978) (f55c458)
- persistence: dataloader for span descendants (#2980) (d8e10d4)
- persistence: ensure migrations run for ThreadSession (#2855) (ec4fea7)
- persistence: fetch latency_ms percentiles using sql with dataloaders (#2818) (48d4643)
- persistence: fetch streaming_last_updated_at (#2819) (d665e49)
- persistence: get or delete projects using sql (#2839) (527b9a9)
- persistence: json binary for postgres (#2849) (29351bf)
- persistence: launch app with persist (#2817) (add6103)
- persistence: make launch_app runnable on tmp directory (#2851) (f41e922)
- persistence: span annotation tables (#2788) (874c61e)
- persistence: span query DSL with SQL (#2911) (7c01420)
- persistence: sql sorting for spans (#2823) (eeafb64)
- persistence: use sqlean v3.45.1 as sqlite engine (#2947) (3b202d7)
- Remove experimental module (#2945) (01758cf)
- restrict project metrics to be last 7 days (#2896) (066bc16)
- span filtering by span evaluations (#2923) (4458ec4)
- Support basic auth (#3061) (3202256)
- support for span evaluations to get evaluations endpoint (#2900) (379e336)
- support pagination on spans resolver (#3046) (2113c5c)
- Update API for OpenAPI compliance (#2866) (0db65d8)
- Update eval summaries to use persistence (#2920) (06eb320)
- add the remainder of the sentence (#2903) (64874b8)
- backward compatible truthiness for query from dict parsing (#3124) (b425f9d)
- cartesian product in sql join (#2959) (c96092d)
- cartesian products in get_evaluations (#3081) (64ebec8)
- check payload for legacy project_name (#3125) (d7eae60)
- close delete modal on delete (#3069) (083a467)
- commit insert into alembic_version (#3115) (93a144f)
- disable client-side sorting on trace/span tables (#2958) (139dc3e)
- disable grpc when readonly (#3105) (71ceba9)
- Dockerfile launches Phoenix that listens on IPv6 (#3047) (75cc979)
- eliminate interference on global tracer provider (#2998) (5d7b843)
- Enable listening on IPv6 (#3037) (dee6681)
- ensure recent version of opentelemetry-proto is used (#2948) (33647f5)
- evals: incorrect wording in hallucinations (#3085) (7aa0292)
- fix docker build for sql (b6d508d)
- forbid blank or empty evaluation names (#2962) (cb87977)
- improve error handling and logging for eval insertions (#2854) (d04694b)
- include migration files (#2887) (b0a772e)
- Invalidate cache on project reset (#3113) (2944ae5)
- normalize datetime for phoenix client (#3088) (94a25ae)
- normalize telemetry url before setup (#3001) (28389e8)
- persistence: db race condition between spans and evals (#2905) (2666464)
- persistence: import assert_never from typing_extensions (#2850) (62644cb)
- persistence: postgres down migration and url support (#2915) (4b4a776)
- persistence: postgres json calculations (#2848) (45f084d)
- persistence: postgres timestamp insertion (#2844) (3477bb9)
- preserve loggers across migrations (#2835) (2821bb4)
- prometheus transaction timers for bulkloader (#3066) (e0cc58d)
- Propagate migration errors and show an informative message (#2994) (3718e10)
- remove broken non-asyncio prometheus grpc server interceptor (#3065) (af75151)
- round down time points to facilitate caching (#3079) (42b03c9)
- run docker as nonroot user (#3100) (c640678)
- safely unpack Evaluations proto in bulk inserter (#2869) (50517f7)
- span and trace evaluation summaries (#3013) (088e6c2)
- span event to dict conversion (#3009) (3c73f03)
- switch license format in toml (5c6f345)
- typo in SpanAnnotation (#2967) (f41044e)
- typo in trace annotation table name (#2946) (344b858)
Documentation
- Add log_traces tutorial (#2902) (e583f03)
- development: make it explicit that you need to run pnpm build (#3035) (672cbed)
- dockerize manual instrumentation example (#2797) (651efbe)
- manually instrumented chatbot (#2730) (46be32b)
- remove experimental tags in code (4c4a832)
3.25.0 (2024-05-06)
- evals: incorrect wording in hallucinations (#3085) (7aa0292)
- run docker as nonroot user (#3100) (c640678)
3.24.0 (2024-04-22)
3.23.0 (2024-04-19)
3.22.0 (2024-04-16)
3.21.0 (2024-04-12)
3.20.0 (2024-04-10)
- dockerize manual instrumentation example (#2797) (651efbe)
- remove experimental tags in code (4c4a832)
3.19.4 (2024-04-04)
- switch license format in toml (5c6f345)
- fix qa with reference tutorial (e1db1ce)
- fix qa with reference tutorial (ba24950)
- make dockerhub URL go to public (6650f67)
- manually instrumented chatbot (#2730) (46be32b)
3.19.3 (2024-03-30)
3.19.2 (2024-03-29)
- ui: broken context for markdown (556e901)
3.19.1 (2024-03-29)
- UI: color rotation for markdown (3184359)
3.19.0 (2024-03-29)
3.18.1 (2024-03-28)
- ignore docs/ directory when formatting (#2714) (1340f74)
- repair frontend build step in release pipeline (#2716) (796eb6a)
3.18.0 (2024-03-28)
3.17.1 (2024-03-24)
- Add mistral (GITBOOK-594) (78676af)
- add mistral instrumentation to notebook (#2681) (54dc47d)
- add mistral instrumentor to mistral tutorial (#2682) (13fc1f8)
- Evals Structure! (GITBOOK-547) (ac23311)
- fix missing parentheses (GITBOOK-571) (2353953)
- Mistral (GITBOOK-595) (f245844)
- No subject (GITBOOK-597) (b6196ac)
- No subject (GITBOOK-598) (f6a2bd6)
- Remove pinecone notebook (#2665) (9f1c1d4)
- trace a deployed app (GITBOOK-593) (08623ea)
3.17.0 (2024-03-21)
- Add response_format argument to MistralAIModel (#2660) (7da51af)
- evals: Add Mistral as an eval model (#2640) (c13ab6b)
3.16.3 (2024-03-20)
3.16.2 (2024-03-20)
- trace: redefine root span (#2632) (7940c9d)
- ui: increase pagination size for TracePage (#2642) (6cd456f)
3.16.1 (2024-03-19)
3.16.0 (2024-03-15)
3.15.1 (2024-03-15)
3.15.0 (2024-03-14)
- launch_app() with experimental span storage using environment variables for storage path and storage type enums (#2564) (8a0b572)
- project archiving and deletion (#2585) (121f904)
- projects: the home page should direct you to the projects page if there are multiple projects with data (#2586) (ced4e75)
- use environment variable for project name (#2590) (e2ace76)
3.14.2 (2024-03-14)
- increase attributes limit on spans (#2575) (94b1930)
- support numpy arrays in span to json encoder (#2583) (3a297d5)
3.14.1 (2024-03-14)
3.14.0 (2024-03-14)
- experimental span storage with append-only text files (909672b)
- experimental span storage with append-only text files (#2553) (909672b)
3.13.1 (2024-03-13)
3.13.0 (2024-03-13)
3.12.0 (2024-03-13)
3.11.1 (2024-03-12)
3.11.0 (2024-03-11)
- graphql: embed project inside graphql span as private attribute (#2522) (9be1afa)
- trace: context manager to pause tracing (#2520) (6bf7232)
- Update pyproject.toml with proper byline (4fdf710)
3.10.0 (2024-03-09)
- projects: add support for the PHOENIX_PROJECT_NAME param (#2515) (6f24786)
- show first non-empty project (#2508) (54a2834)
3.9.0 (2024-03-08)
3.8.0 (2024-03-07)
The Phoenix evals module is graduating out of experimental! You can now install Phoenix evals as a standalone package with pip install arize-phoenix-evals, or you can include the new version of phoenix.evals along with the Phoenix install with pip install -U arize-phoenix[evals]. Swapping to the new evals module includes a few small breaking changes which might require some migration work. Details can be found in MIGRATION.md.
phoenix.experimental.evals is being deprecated and will remain in Phoenix for about a month before being removed.
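While both module paths exist, code that has to run against older and newer installs could try the new path first and fall back to the deprecated one. The sketch below is only an illustration and is not part of Phoenix: `resolve_evals_module` and `EVALS_MODULE_CANDIDATES` are hypothetical names, and the only details taken from this note are the two module paths and the pip command.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence

# Module paths named in this release note, newest first.
EVALS_MODULE_CANDIDATES = ("phoenix.evals", "phoenix.experimental.evals")


def resolve_evals_module(
    candidates: Sequence[str] = EVALS_MODULE_CANDIDATES,
) -> Optional[ModuleType]:
    """Hypothetical helper: return the first importable module, or None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or already removed); try the next candidate.
            continue
    return None


evals = resolve_evals_module()
if evals is None:
    print("Phoenix evals not found; try: pip install arize-phoenix-evals")
else:
    print(f"Using evals from {evals.__name__}")
```

Because the new module carries a few breaking changes, a fallback like this only helps for names that behave the same in both modules; MIGRATION.md remains the reference for what changed.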
- gql: add trace count to gql project (#2484) (91b4ae1)
- Integrate phoenix.evals into phoenix (#2420) (dd3e7b4)
3.7.0 (2024-03-07)
3.6.0 (2024-03-06)
- traces: store and query spans by project name (#2433) (b8ef923)
- ui: auto-expand side nav on hover (#2458) (da83f69)
3.5.0 (2024-03-05)
- add metadata to spans and traces table (#2339) (e9725a2)
- Removes token processing module from phoenix.evals (#2421) (fbd4961)
- ui: new side nav with projects (#2359) (d8c423e)
- Properly define BedrockModel (#2425) (81a720c)
- remove computed attributes from exported dataframe (#2366) (1de1415)
- turn span_kind enums into string because it's not serializable by pyarrow (#2438) (50c7eb0)
- update rag and llm ops notebooks (#2442) (adf1b2b)
- evals: update tracing tutorials with arize-phoenix-evals (#2386) (1af8187)
- log information about the server at startup (#2445) (6d410c1)
- update readme for phoenix.evals, fix llama-index example (#2435) (dfffaad)
3.4.1 (2024-02-29)
- remove symbolic links for docker build (#2408) (b57abe9)
- source distribution build (#2407) (1e67d7e)
3.4.0 (2024-02-28)
- remove run_relevance_evals and fix import issues (#2375) (9a97e62)
- traces: add y scroll on trace tree (#2399) (9c4f6b9)
- evals: add README (#2363) (47842da)
- evals: migrate evaluation notebooks (#2388) (3dedc6e)
- update ragas integration (#2400) (7bebe98)
3.3.0 (2024-02-23)
- display status description under trace info (#2334) (aed925f)
- show span as soon as they arrive (#2353) (88397a5)
3.2.1 (2024-02-16)
3.2.0 (2024-02-16)
- px.Client log_evaluations (#2308) (69a4b2b)
- trace: display metadata in the trace page UI (#2304) (fce2d63)
3.1.2 (2024-02-15)
- allow json string for metadata span attribute (#2301) (ec7fbe2)
- ui: safely parse JSON and fallback to string for span attributes (#2293) (e43cdbb)
3.1.1 (2024-02-15)
- fix: cast message to string in vertexai model (86947a2)
3.1.0 (2024-02-15)
- set global session to None if it fails to start (#2286) (6752fd2)
- trace: Make dataset IDs unique by instance for TraceDataset (#2254) (1ac170f)
3.0.3 (2024-02-13)
3.0.2 (2024-02-13)
3.0.1 (2024-02-09)
3.0.0 (2024-02-09)
- replace Phoenix tracers with OpenInference instrumentors (#2190)
2.11.1 (2024-02-09)
2.11.0 (2024-02-08)
2.10.0 (2024-02-07)
- endpoint for client inside ProcessSession (#2211) (82e279e)
- trace: return to /tracing url when dismissing trace slide over (#2222) (ee4ced3)
- traces: warn if collector endpoint is set but launch app is called (#2209) (eb97b8d)
2.9.4 (2024-02-06)
2.9.3 (2024-02-05)
2.9.2 (2024-02-05)
2.9.1 (2024-02-05)
2.9.0 (2024-02-05)
- phoenix client get_evaluations() and get_trace_dataset() (#2154) (29800e4)
- phoenix client get_spans_dataframe() and query_spans() (#2151) (e44b948)
2.8.0 (2024-02-02)
- broken link and openinference links (#2144) (01fb046)
- databricks check crashes in python console (#2152) (5aeeeff)
- default collector endpoint breaks on windows (#2161) (f1a2007)
- Do not retry when context window has been exceeded (#2126) (ff6df1f)
- remove hyphens from span_id in legacy evaluation fixtures (#2153) (fae859d)
- add docker badge (e584ed8)
- Add terminal running steps (GITBOOK-441) (91c6b24)
- No subject (GITBOOK-442) (5c4eb6c)
- No subject (GITBOOK-443) (11f46cb)
- No subject (GITBOOK-444) (fcf2bc9)
- update badge (ddcecea)
- update prompt to reflect rails (GITBOOK-445) (dea6dd6)
2.7.0 (2024-01-24)
2.6.0 (2024-01-23)
- add ability to save and load TraceDatasets (#2082) (60c5e5e)
- add get_trace_dataset method to session (#2107) (9754b60)
- evals: Gpt 4 turbo context window size (#2112) (389c1a0)
- launch phoenix with evaluations (#2095) (9656d0c)
- support eval exports for session (#2094) (8757fa8)
- Clean up vertex clients after event loop closure (#2102) (202c7ea)
- Determine default async concurrency on a per-model basis (#2096) (b44d8aa)
- Resolves Bedrock model compatibility issues (#2114) (c4a5343)
- show localhost when the notebook is running locally (#2090) (095298d)
- evals: update RAG evaluations notebook (#2092) (9ad797a)
- evals: update ragas integration notebook (#2100) (66fb048)
2.5.0 (2024-01-16)
2.4.1 (2024-01-11)
- traces: prevent missing key exception when extracting invocation parameters in llama-index (#2076) (5cc9560)
2.4.0 (2024-01-10)
- add persistence for span evaluations (#2021) (589d482)
- ui: add filter condition snippets (#2049) (567fa54)
- Handle missing vertex candidates (#2055) (1d0475a)
- OpenAI clients are not cleaned up after calls to llm_classify (#2068) (3233d56)
- traces: remove nan from log_evaluations (#2056) (df9ed5c)
2.3.0 (2024-01-08)
- Add demo link, examples getting started (GITBOOK-396) (e987315)
- Add Evaluating Traces Section (GITBOOK-386) (7d72029)
- Add evaluations section for results (GITBOOK-387) (2e74be0)
- Add final thoughts to evaluation (GITBOOK-405) (20eab16)
- add import statement (GITBOOK-408) (23247d7)
- add link (GITBOOK-403) (0be280a)
- eval concepts typo (GITBOOK-394) (7c80d4b)
- eval concepts typos (GITBOOK-393) (62bc99f)
- evaluation concepts typo fix (GITBOOK-390) (2cbc1dc)
- Extract Data from Spans (GITBOOK-383) (440f530)
- fix broken section link (GITBOOK-409) (fee537b)
- fix typos (GITBOOK-391) (c8f5a55)
- fix typos (GITBOOK-402) (3cd973d)
- fix typos (GITBOOK-406) (eaa9bea)
- fix typos (GITBOOK-407) (cad4820)
- Initial draft of evaluation core concept (GITBOOK-385) (67369cf)
- Log Evaluations (GITBOOK-389) (369d79d)
- No subject (GITBOOK-399) (94df884)
- Re-arrange nav (GITBOOK-398) (54a87eb)
- Remove the word golden, simplify title (GITBOOK-395) (a2233b2)
- simplify concepts (GITBOOK-384) (c38f6c2)
- Simplify examples page (GITBOOK-400) (6144158)
- Trace Evaluations Section (GITBOOK-388) (2ffa800)
- Update SECURITY.md (#2029) (363e891)
2.2.1 (2023-12-28)
- Do not retry if eval was successful when using SyncExecutor (#2016) (a869190)
- ensure float values are properly encoded by otel tracer (#2024) (b12a894)
- ensure llamaindex spans are correctly encoded (#2023) (3ca6262)
- Use separate versioning file (#2020) (f38eedf)
2.2.0 (2023-12-22)
- Add support for Google's Gemini models via Vertex python sdk (#2008) (caf826c)
- Support first-party Anthropic python SDK (#2004) (a323283)
2.1.0 (2023-12-21)
- instantiate evaluators by criteria (#1983) (9c72616)
- support function calling for run_evals (#1978) (8be325c)
- traces: add v1/traces HTTP endpoint to handle ExportTraceServiceRequest (3c94dea)
- traces: add v1/traces HTTP endpoint to handle ExportTraceServiceRequest (#1968) (3c94dea)
- traces: add retrieval summary to header (#2006) (8af0582)
- traces: evaluation summary on the header (#2000) (965beb0)
2.0.0 (2023-12-20)
- Update llm_classify and llm_generate interfaces (#1974)
- Add async submission to llm_generate (#1965) (5999133)
- add support for explanations to run_evals (#1975) (5143529)
- evaluation column selectors (#1932) (ed07809)
- openai streaming tool calls (#1936) (6dd14cf)
- support running multiple evals at once (#1742) (79d4473)
- Update llm_classify and llm_generate interfaces (#1974) (9fd35a1)
- Add lock failsafe (#1956) (9ddbd9c)
- llama-index extra (#1958) (d9b68eb)
- LlamaIndex compatibility fix (#1940) (052349d)
- Model stability enhancements (#1939) (dca42e0)
- traces: span summary root span filter (#1981) (d286f07)
- Add anyscale tutorial (#1941) (e47c8d0)
- autogen link (#1946) (c3fb4ce)
- Clear anyscale tutorial outputs (#1942) (63580a6)
- RAG Evaluation (GITBOOK-378) (429f537)
- sync (#1947) (c72bbac)
- traces: autogen tracing tutorial (#1945) (0fd02ff)
- update rag eval notebook (#1950) (d06b8b7)
- update rag evals docs (#1954) (aa6f36a)
- Using phoenix with HuggingFace LLMs - getting started (#1916) (b446972)
1.9.0 (2023-12-11)
1.8.0 (2023-12-10)
- embeddings: audio support (#1920) (61cc550)
- openai streaming function call message support (#1914) (25279ca)
1.7.0 (2023-12-09)
- Instrument LlamaIndex streaming responses (#1901) (f46396e)
- openai async streaming instrumentation (#1900) (06d643b)
- traces: query spans into dataframes (#1910) (6b51435)
1.6.0 (2023-12-08)
- openai streaming spans show up in the ui (#1888) (ffa1d41)
- support instrumentation for openai synchronous streaming (#1879) (b6e8c73)
- traces: display document retrieval metrics on trace details (#1902) (0c35229)
- traces: filterable span and document evaluation summaries (#1880) (f90919c)
- traces: graphql query for document evaluation summary (#1874) (8a6a063)
1.5.1 (2023-12-06)
1.5.0 (2023-12-06)
- evals: Human vs AI Evals (#1850) (e96bd27)
- semantic conventions for tool_calls array in OpenAI ChatCompletion messages (#1837) (c079f00)
- support asynchronous chat completions for openai instrumentation (#1849) (f066e10)
- traces: document retrieval metrics based on document evaluation scores (#1826) (3dfb7bd)
- traces: document retrieval metrics on trace / span tables (#1873) (733d233)
- traces: evaluation annotations on traces for associating spans with eval metrics (#1693) (a218a65)
- traces: server-side span filter by evaluation result values (#1858) (6b05f96)
- traces: span evaluation summary (aggregation metrics of scores and labels) (#1846) (5c5c3d6)
- allow streaming response to be iterated by user (#1862) (76a2443)
- trace dataset to disc (#1798) (278d344)
1.4.0 (2023-11-30)
- propagate error status codes to parent spans for improved visibility into trace exceptions (#1824) (1a234e9)
1.3.0 (2023-11-30)
- Add OpenAI Rate limiting (#1805) (115e044)
- evals: show span evaluations in trace details slideout (#1810) (4f0e4dc)
- evaluation ingestion (no user-facing feature is added) (#1764) (7c4039b)
- feature flags context (#1802) (a2732cd)
- Implement asynchronous submission for OpenAI evals (#1754) (30c011d)
- reference link correctness evaluation prompt template (#1771) (bf731df)
- traces: configurable endpoint for the exporter (#1795) (8515763)
- traces: display document evaluations alongside the document (#1823) (2ca3613)
- traces: server-side sort of spans by evaluation result (score or label) (#1812) (d139693)
- traces: show all evaluations in the table (#1819) (2b27333)
- traces: Trace page header with latency, status, and evaluations (#1831) (1d88efd)
- enhance llama-index callback support for exception events (#1814) (8db01df)
- pin llama-index temporarily (#1806) (d6aa76e)
- remove sklearn metrics not available in sagemaker (#1791) (20ab6e5)
- traces: convert (non-list) iterables to lists during protobuf construction due to potential presence of ndarray when reading from parquet files (#1801) (ca72747)
- traces: make column selector sync'd between tabs (#1816) (125431a)
- Environment documentation (GITBOOK-370) (dbbb0a7)
- Explanations (GITBOOK-371) (5f33da3)
- No subject (GITBOOK-369) (656b5c0)
- sync for 1.3 (#1833) (4d01e83)
- update default value of variable in run_relevance_eval (GITBOOK-368) (d5bcaf8)
1.2.1 (2023-11-18)
- make the app launchable when nest_asyncio is applied (#1783) (f9d5085)
- restore process session (#1781) (34a32c3)
1.2.0 (2023-11-17)
- Add dockerfile (#1761) (4fa8929)
- evals: return partial results when llm function is interrupted (#1755) (1fb0849)
- LiteLLM model support for evals (#1675) (5f2a999)
- sagemaker notebook support (#1772) (2c0ffbc)
1.1.1 (2023-11-16)
1.1.0 (2023-11-14)
- Evals with explanations (#1699) (2db8141)
- evals: add an output_parser to llm_generate (#1736) (6408dda)
1.0.0 (2023-11-10)
- models: openAI 1.0 (#1716)
0.1.1 (2023-11-09)
0.1.0 (2023-11-08)
- add long-context evaluators, including map reduce and refine patterns (#1710) (0c3b105)
- traces: span table column visibility controls (#1687) (559852f)