[Install] failed to load source map #145

Open · stefnestor opened this issue Jul 5, 2024 · 0 comments
Labels: bug (Something isn't working)
What happened?

Hello! Your tool looks cool; thanks for building it. I'm experiencing errors while installing, which I believe relate to the historical issues #27 and/or #91.

Toggling the Embedding Model setting sometimes resolves it, so this is potentially just an FYI: I'm not sure how to make it work 100% of the time, but retrying until it works is okay for my use case.

Error Statement

No response

Steps to Reproduce

  1. Environment: Apple Mac (M2) on Sonoma 14.5, running Obsidian v1.6.5, with Ollama installed via Homebrew and started in debug mode:

    $ which ollama
    /opt/homebrew/bin/ollama
    $ ollama --version
    Warning: could not connect to a running Ollama instance
    Warning: client version is 0.1.38
    $ OLLAMA_DEBUG="1" ollama serve 
  2. Command palette "Open sandbox vault" > Settings > Community Plugins > "Turn on community plugins" > install "Smart Second Brain" v1.3.0 > enable it > exit settings > command palette "Smart Second Brain: Open Chat" > follow the default setup flow for "Run on your machine".

  3. Click "Start your smart second brain" > (the vault indexes) > send "test" to the RAG-AI and receive an HTTP 500 error. Quit and re-open Obsidian and see the DevTools > Console error below. Send "test" again and receive another HTTP 500 error. See ((A)) for the correlated Ollama debug logs, and the curl sketch after these steps for hitting Ollama's API directly.

    DevTools failed to load source map: Could not load content for file:///home/runner/work/obsidian-Smart2Brain/obsidian-Smart2Brain/build/smart-second-brain/main.js.map: Unexpected end of JSON input
    
    Failed to run Smart Second Brain (Error: ,Error: Ollama call failed with status code 500: unsupported model format,). Please retry.
    
  4. As in the issues referenced above, I do have Excalidraw enabled in my home vault, but not within this sandbox.

  5. If you toggle the plugin and/or the RAG-AI (the Octopus icon highlighted purple or not) off and on, you may eventually encounter something similar to the previous error, where the console reports

    Assistant: Failed to run Smart Second Brain (Error: ,Error: Expected a Runnable, function or object. Instead got an unsupported type.,). Please retry.
    
  6. Porting the test settings over to llama2-uncensored and reindexing ... 👀 it worked this time ((B)).
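
To isolate whether the HTTP 500s come from Ollama itself rather than the plugin, here is a minimal sketch of hitting the same endpoints the logs show (/api/embeddings and /api/chat) directly with curl, assuming the default port 11434; the model tags are assumptions matching my setup and should be adjusted to whatever `ollama list` reports:

    $ # embeddings endpoint (returns 200 in the ((A)) logs)
    $ curl http://localhost:11434/api/embeddings \
        -d '{"model": "nomic-embed-text", "prompt": "test"}'
    $ # chat endpoint (returns 500 in the ((A)) logs)
    $ curl http://localhost:11434/api/chat \
        -d '{"model": "llama2-uncensored", "messages": [{"role": "user", "content": "test"}]}'

If the chat call 500s here too, the bug is upstream of the plugin; if it succeeds, the problem is in how the plugin invokes Ollama.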

((B))

  • Initial start-up always fails, and the errors only resolve after toggling Settings > Community Plugins > Smart Second Brain > Embedding Model. It doesn't matter what you toggle it to, and you can toggle it back afterwards. The error sometimes repeats, but further toggling resolves it, which makes it confusing to know where to check.
  • 🙋‍♀️ ((B2)) A slightly confusing point, if I may request confirmation: if two vaults (both without Excalidraw installed) use the same Chat + Embedding Models, their answers appear to cross-pollinate across vaults.
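
Since toggling the Embedding Model appears to force a model reload that clears the error, here is a sketch of what I'd run to also rule out a corrupted model blob behind the "unsupported model format" 500 (ollama list and ollama pull are standard CLI commands; the model tags are assumptions matching my setup):

    $ ollama list                    # confirm both models are present with sensible sizes
    $ ollama pull nomic-embed-text   # re-pull the embedding model in case its blob is corrupt
    $ ollama pull llama2-uncensored  # same for the chat model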
((A))
time=2024-07-04T18:03:09.856-06:00 level=DEBUG source=gguf.go:57 msg="model = &llm.gguf{containerGGUF:(*llm.containerGGUF)(0x140005284c0), kv:llm.KV{}, tensors:[]*llm.Tensor(nil), parameters:0x0}"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=sched.go:153 msg="loading first model" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=memory.go:44 msg=evaluating library=metal gpu_count=1 available="21.3 GiB"
time=2024-07-04T18:03:09.973-06:00 level=INFO source=memory.go:133 msg="offload to gpu" layers.requested=-1 layers.real=13 memory.available="21.3 GiB" memory.required.full="862.9 MiB" memory.required.partial="862.9 MiB" memory.required.kv="24.0 MiB" memory.weights.total="260.9 MiB" memory.weights.repeating="216.1 MiB" memory.weights.nonrepeating="44.7 MiB" memory.graph.full="48.0 MiB" memory.graph.partial="48.0 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=sched.go:565 msg="new model will fit in available VRAM in single GPU, loading" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 gpu=0 available=22906503168 required="862.9 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=server.go:100 msg="system memory" total="32.0 GiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=memory.go:44 msg=evaluating library=metal gpu_count=1 available="21.3 GiB"
time=2024-07-04T18:03:09.973-06:00 level=INFO source=memory.go:133 msg="offload to gpu" layers.requested=-1 layers.real=13 memory.available="21.3 GiB" memory.required.full="862.9 MiB" memory.required.partial="862.9 MiB" memory.required.kv="24.0 MiB" memory.weights.total="260.9 MiB" memory.weights.repeating="216.1 MiB" memory.weights.nonrepeating="44.7 MiB" memory.graph.full="48.0 MiB" memory.graph.partial="48.0 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal
time=2024-07-04T18:03:09.974-06:00 level=INFO source=server.go:320 msg="starting llama server" cmd="/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal/ollama_llama_server --model /Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 --ctx-size 8192 --batch-size 512 --embedding --log-disable --n-gpu-layers 13 --verbose --parallel 1 --port 51135"
time=2024-07-04T18:03:09.974-06:00 level=DEBUG source=server.go:335 msg=subprocess environment="[PATH=/Applications/Sublime Text.app/Contents/SharedSupport/bin:/usr/local/bin:/usr/local/sbin:/usr/local/opt/python/libexec/bin:/usr/local/sbin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Library/Apple/usr/bin:/Applications/Wireshark.app/Contents/MacOS:/Applications/iTerm.app/Contents/Resources/utilities LD_LIBRARY_PATH=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal]"
time=2024-07-04T18:03:09.975-06:00 level=INFO source=sched.go:338 msg="loaded runners" count=1
time=2024-07-04T18:03:09.975-06:00 level=INFO source=server.go:504 msg="waiting for llama runner to start responding"
time=2024-07-04T18:03:09.975-06:00 level=INFO source=server.go:540 msg="waiting for server to become available" status="llm server error"
INFO [main] build info | build=2770 commit="952d03d" tid="0x1fe8b0c00" timestamp=1720137789
INFO [main] system info | n_threads=6 n_threads_batch=-1 system_info="AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | " tid="0x1fe8b0c00" timestamp=1720137789 total_threads=10
INFO [main] HTTP server listening | hostname="127.0.0.1" n_threads_http="9" port="51135" tid="0x1fe8b0c00" timestamp=1720137789
llama_model_loader: loaded meta data with 24 key-value pairs and 112 tensors from /Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = nomic-bert
llama_model_loader: - kv   1:                               general.name str              = nomic-embed-text-v1.5
llama_model_loader: - kv   2:                     nomic-bert.block_count u32              = 12
llama_model_loader: - kv   3:                  nomic-bert.context_length u32              = 2048
llama_model_loader: - kv   4:                nomic-bert.embedding_length u32              = 768
llama_model_loader: - kv   5:             nomic-bert.feed_forward_length u32              = 3072
llama_model_loader: - kv   6:            nomic-bert.attention.head_count u32              = 12
llama_model_loader: - kv   7:    nomic-bert.attention.layer_norm_epsilon f32              = 0.000000
llama_model_loader: - kv   8:                          general.file_type u32              = 1
llama_model_loader: - kv   9:                nomic-bert.attention.causal bool             = false
llama_model_loader: - kv  10:                    nomic-bert.pooling_type u32              = 1
llama_model_loader: - kv  11:                  nomic-bert.rope.freq_base f32              = 1000.000000
llama_model_loader: - kv  12:            tokenizer.ggml.token_type_count u32              = 2
llama_model_loader: - kv  13:                tokenizer.ggml.bos_token_id u32              = 101
llama_model_loader: - kv  14:                tokenizer.ggml.eos_token_id u32              = 102
llama_model_loader: - kv  15:                       tokenizer.ggml.model str              = bert
llama_model_loader: - kv  16:                      tokenizer.ggml.tokens arr[str,30522]   = ["[PAD]", "[unused0]", "[unused1]", "...
llama_model_loader: - kv  17:                      tokenizer.ggml.scores arr[f32,30522]   = [-1000.000000, -1000.000000, -1000.00...
llama_model_loader: - kv  18:                  tokenizer.ggml.token_type arr[i32,30522]   = [3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  19:            tokenizer.ggml.unknown_token_id u32              = 100
llama_model_loader: - kv  20:          tokenizer.ggml.seperator_token_id u32              = 102
llama_model_loader: - kv  21:            tokenizer.ggml.padding_token_id u32              = 0
llama_model_loader: - kv  22:                tokenizer.ggml.cls_token_id u32              = 101
llama_model_loader: - kv  23:               tokenizer.ggml.mask_token_id u32              = 103
llama_model_loader: - type  f32:   51 tensors
llama_model_loader: - type  f16:   61 tensors
llm_load_vocab: mismatch in special tokens definition ( 7104/30522 vs 5/30522 ).
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = nomic-bert
llm_load_print_meta: vocab type       = WPM
llm_load_print_meta: n_vocab          = 30522
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: n_ctx_train      = 2048
llm_load_print_meta: n_embd           = 768
llm_load_print_meta: n_head           = 12
llm_load_print_meta: n_head_kv        = 12
llm_load_print_meta: n_layer          = 12
llm_load_print_meta: n_rot            = 64
llm_load_print_meta: n_embd_head_k    = 64
llm_load_print_meta: n_embd_head_v    = 64
llm_load_print_meta: n_gqa            = 1
llm_load_print_meta: n_embd_k_gqa     = 768
llm_load_print_meta: n_embd_v_gqa     = 768
llm_load_print_meta: f_norm_eps       = 1.0e-12
llm_load_print_meta: f_norm_rms_eps   = 0.0e+00
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 3072
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 0
llm_load_print_meta: pooling type     = 1
llm_load_print_meta: rope type        = 2
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 1000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx  = 2048
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: model type       = 137M
llm_load_print_meta: model ftype      = F16
llm_load_print_meta: model params     = 136.73 M
llm_load_print_meta: model size       = 260.86 MiB (16.00 BPW)
llm_load_print_meta: general.name     = nomic-embed-text-v1.5
llm_load_print_meta: BOS token        = 101 '[CLS]'
llm_load_print_meta: EOS token        = 102 '[SEP]'
llm_load_print_meta: UNK token        = 100 '[UNK]'
llm_load_print_meta: SEP token        = 102 '[SEP]'
llm_load_print_meta: PAD token        = 0 '[PAD]'
llm_load_print_meta: CLS token        = 101 '[CLS]'
llm_load_print_meta: MASK token       = 103 '[MASK]'
llm_load_print_meta: LF token         = 0 '[PAD]'
llm_load_tensors: ggml ctx size =    0.11 MiB
ggml_backend_metal_buffer_from_ptr: allocated buffer, size =   260.88 MiB, (  260.94 / 21845.34)
llm_load_tensors: offloading 12 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 13/13 layers to GPU
llm_load_tensors:        CPU buffer size =    44.72 MiB
llm_load_tensors:      Metal buffer size =   260.87 MiB
.......................................................
llama_new_context_with_model: n_ctx      = 8192
llama_new_context_with_model: n_batch    = 512
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: freq_base  = 1000.0
llama_new_context_with_model: freq_scale = 1
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Pro
ggml_metal_init: picking default device: Apple M2 Pro
ggml_metal_init: using embedded metal library
ggml_metal_init: GPU name:   Apple M2 Pro
ggml_metal_init: GPU family: MTLGPUFamilyApple8  (1008)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction support   = true
ggml_metal_init: simdgroup matrix mul. support = true
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 22906.50 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   288.00 MiB, (  550.75 / 21845.34)
llama_kv_cache_init:      Metal KV buffer size =   288.00 MiB
llama_new_context_with_model: KV self size  =  288.00 MiB, K (f16):  144.00 MiB, V (f16):  144.00 MiB
llama_new_context_with_model:        CPU  output buffer size =     0.00 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =    23.02 MiB, (  573.77 / 21845.34)
llama_new_context_with_model:      Metal compute buffer size =    23.00 MiB
llama_new_context_with_model:        CPU compute buffer size =     3.50 MiB
llama_new_context_with_model: graph nodes  = 453
llama_new_context_with_model: graph splits = 2
DEBUG [initialize] initializing slots | n_slots=1 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [initialize] new slot | n_ctx_slot=8192 slot_id=0 tid="0x1fe8b0c00" timestamp=1720137790
INFO [main] model loaded | tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] all slots are idle and system prompt is empty, clear the KV cache | tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=0 tid="0x1fe8b0c00" timestamp=1720137790
time=2024-07-04T18:03:10.227-06:00 level=INFO source=server.go:545 msg="llama runner started in 0.25 seconds"
time=2024-07-04T18:03:10.227-06:00 level=DEBUG source=sched.go:351 msg="finished setting up runner" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=1 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] slot released | n_cache_tokens=1 n_ctx=8192 n_past=1 n_system_tokens=0 slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790 truncated=false
DEBUG [log_server_request] request | method="POST" params={} path="/embedding" remote_addr="127.0.0.1" remote_port=51137 status=200 tid="0x16af3b000" timestamp=1720137790
[GIN] 2024/07/04 - 18:03:10 | 200 |  411.384417ms |       127.0.0.1 | POST     "/api/embeddings"
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:355 msg="context for request finished"
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:237 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 duration=5m0s
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:255 msg="after processing request finished event" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 refCount=0
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:129 msg="max runners achieved, unloading one to make room" runner_count=1
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:602 msg="found an idle runner to unload"
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:181 msg="resetting model to expire immediately to make room" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 refCount=0
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:194 msg="waiting for pending requests to complete and unload to occur" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:258 msg="runner expired event received" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:274 msg="got lock to unload" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=server.go:954 msg="stopping llama server"
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=server.go:960 msg="waiting for llama server to exit"
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=server.go:964 msg="llama server stopped"
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:279 msg="runner released" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:283 msg="sending an unloaded event" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:200 msg="unload completed" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
[GIN] 2024/07/04 - 18:03:10 | 500 |    5.307625ms |       127.0.0.1 | POST     "/api/chat"
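
Reading the tail of these logs: the scheduler reports "max runners achieved, unloading one to make room" and unloads the embedding runner immediately before the chat request returns 500. A speculative sketch, assuming this Ollama build honors the OLLAMA_MAX_LOADED_MODELS environment variable, to let the embedding and chat models stay resident together:

    $ # speculative: allow two models to stay loaded at once instead of evicting
    $ OLLAMA_DEBUG="1" OLLAMA_MAX_LOADED_MODELS=2 ollama serve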

Smart Second Brain Version

1.3.0

Debug Info

DevTools failed to load source map: Could not load content for file:///home/runner/work/obsidian-Smart2Brain/obsidian-Smart2Brain/build/smart-second-brain/main.js.map: Unexpected end of JSON input

Assistant: Failed to run Smart Second Brain (Error: ,Error: Expected a Runnable, function or object. Instead got an unsupported type.,). Please retry.
