Could it support Chinese? #21
I've created this branch: https://github.com/matthewfranglen/postgres-elasticsearch-fdw/tree/chinese which has a test for inserting Chinese. It passes. There seems to be a problem with your system configuration, as it is attempting to decode the string as ascii:
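The failure mode described here can be reproduced in isolation. This is a minimal sketch (not taken from the FDW itself) of what happens when UTF-8 bytes are decoded with the ascii codec:

```python
# Minimal reproduction: UTF-8 encoded Chinese text decoded with the ascii codec.
data = '中文测试'.encode('utf-8')

try:
    data.decode('ascii')
except UnicodeDecodeError as exc:
    # Raises because every byte of the UTF-8 encoding is >= 0x80,
    # which is outside the ascii range.
    print(exc)
```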
At the moment I don't really have a solid idea why your system is defaulting to decoding the string as ascii. I suggest looking at how the docker images differ from your system.
I think you should look into your locale settings. See locale.getdefaultlocale, which may be the cause of your issue. Inside the postgres docker container the
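For reference, the locale values mentioned above can be inspected directly. This is a diagnostic sketch, not part of the FDW; an unset or POSIX/C locale is a common reason for libraries falling back to ascii:

```python
import locale

# What Python infers from the LANG/LC_* environment variables.
print(locale.getdefaultlocale())

# The encoding Python will prefer for text I/O, e.g. 'UTF-8'.
print(locale.getpreferredencoding(False))
```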
I found that the problem is not caused by using Chinese; it is caused by using the bytea type.
What's the table definition?
CREATE EXTENSION multicorn;
CREATE SERVER multicorn_es FOREIGN DATA WRAPPER multicorn
OPTIONS (
wrapper 'pg_es_fdw.ElasticsearchFDW'
);
DROP FOREIGN TABLE IF EXISTS "public"."test_es";
CREATE FOREIGN TABLE test_es (
"id" int8 NOT NULL,
"num" int4,
"flag" bit(3),
"istrue" bool,
"createdate" date,
"price" numeric(5,2),
"price2" float8,
"datetime" timestamp(6),
"address" varchar(255) COLLATE "pg_catalog"."default",
"message" text COLLATE "pg_catalog"."default",
"testbytea" bytea,
"testjson" json,
"time" time(6)
)
SERVER multicorn_es
OPTIONS(
host '10.10.0.160',
port '9200',
index 'test_es',
type 'doc',
rowid_column 'id',
query_column 'query',
query_dsl 'false',
-- score_column 'message',
-- default_sort 'last_updated:desc',
-- sort_column 'id',
refresh 'false',
complete_returning 'false',
timeout '20',
username 'elastic',
password 'changeme'
)
;

INSERT INTO "public"."test_es" VALUES (1, 666, '111', 't', '2021-03-05', 11.01, 66.666, '2021-03-05 12:22:54', 'asdfa', 'adsfaaaa', '中文测试'::bytea, '{}', '11:27:00');
The elasticsearch schema would be great if you have that as well.
The Elasticsearch version is 6.8.6; the index was created by default when I inserted the data:

{
"state": "open",
"settings": {
"index": {
"creation_date": "1614943999235",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "30jZAEgwQLunvvH8A1FzGA",
"version": {
"created": "6080699"
},
"provided_name": "test_es"
}
},
"mappings": {
"doc": {
"properties": {
"address": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"flag": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"num": {
"type": "long"
},
"createdate": {
"type": "date"
},
"message": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"istrue": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"datetime": {
"type": "date"
},
"testbytea": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"price666": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"testjson": {
"type": "object"
},
"price": {
"type": "float"
},
"time": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"price2": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}
}
}
},
"aliases": [],
"primary_terms": {
"0": 1,
"1": 1,
"2": 1,
"3": 1,
"4": 1
},
"in_sync_allocations": {
"0": [
"7UsFkimCRji0kviOgKk7gg",
"srAnowRfQBSmlzSTGge6bA"
],
"1": [
"OhjG7VQ2Tm2l5VKU3c7IZw",
"w_y52WfjTuGU-p7unXTQ6w"
],
"2": [
"rpfDNJheRjSdLpg-ZYkxMg",
"YT1fVCEZT1mrdNyFrqnSbg"
],
"3": [
"8OCOll5qQjyyJ11pgyczzw",
"Osh_2xoQRgm_zbhNArTTvQ"
],
"4": [
"hsy1xN8aRDqBNhex3kjqNA",
"yzp3uT-OSlWPRFdvIhCbKA"
]
}
}
That's great, thank you.
I wonder if the problem is this field definition:

"testbytea": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
}

It looks like it should have a type of binary. I'll see what I can do about fixing this on the postgres side.
I changed the ES index type to binary, but the problem still exists; maybe it is caused by the PHP converter.
The problem is in the PG FDW. Elasticsearch expects the data to be base64 encoded. I need to add a data conversion layer for the more complex types, which I've been putting off. Doing that should also allow me to address the GeoField ticket. Sorry about the delay.
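The conversion layer described here can be sketched simply. This is a hypothetical illustration (the function names are not from the FDW) of how bytea values would be translated to and from Elasticsearch's binary field type, which stores base64-encoded strings:

```python
import base64


def bytea_to_es_binary(value: bytes) -> str:
    # Elasticsearch's `binary` field type expects a base64 string.
    return base64.b64encode(value).decode('ascii')


def es_binary_to_bytea(value: str) -> bytes:
    # Reverse the conversion when reading documents back into postgres.
    return base64.b64decode(value)


# The round trip preserves the original bytes, including multi-byte UTF-8.
payload = '中文测试'.encode('utf-8')
assert es_binary_to_bytea(bytea_to_es_binary(payload)) == payload
```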
Thanks for the help, I found a way to solve the problem.
I changed the system encoding by creating a sitecustomize.py in the Python Lib\site-packages directory.
After rebooting the system, it works!
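The sitecustomize.py file itself was not shown in the thread. A hypothetical reconstruction, assuming the classic Python 2 default-encoding workaround (this does not apply to Python 3, where the default encoding is always UTF-8):

```python
# sitecustomize.py -- hypothetical reconstruction (Python 2 only).
import sys
reload(sys)                      # restore setdefaultencoding, which site.py removes
sys.setdefaultencoding('utf-8')  # make implicit str decoding use UTF-8, not ascii
```

Note this hack is widely discouraged because it changes decoding behaviour process-wide; fixing the locale environment (LANG/LC_ALL) is the usual alternative.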
(Original issue description:) When I try to insert Chinese, I get the following error: