autocomplete: tune autocorrect algorithm.

Tune autocorrect to produce fewer false positives. Previously we used trigram similarity; now we use Levenshtein edit distance with a dynamic typo threshold. Trigram similarity could correct large transpositions (e.g. `miku_hatsune` -> `hatsune_miku`), but it was bad at correcting small typos. Levenshtein distance is good at small typos, but can't correct large transpositions.
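
To illustrate the tradeoff, here is a minimal sketch in Python rather than the SQL `levenshtein()` function that fuzzystrmatch provides. The names `typo_threshold` and `autocorrect` and the threshold rule itself (one allowed edit per four characters, capped at three) are hypothetical; the actual threshold lives in the application code, which is not part of this diff.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or keep if equal)
            ))
        previous = current
    return previous[-1]


def typo_threshold(query: str) -> int:
    # Hypothetical rule (an assumption, not taken from this commit):
    # longer queries tolerate more edits, short queries tolerate none.
    return min(3, len(query) // 4)


def autocorrect(query: str, candidates: list[str]) -> list[str]:
    """Return candidates within the query's typo threshold, closest first."""
    limit = typo_threshold(query)
    scored = sorted((levenshtein(query, c), c) for c in candidates)
    return [c for distance, c in scored if distance <= limit]


if __name__ == "__main__":
    tags = ["hatsune_miku", "miku_hatsune"]
    print(autocorrect("hatsune_mkiu", tags))  # small typo in the query
    print(autocorrect("miku_hatsune", ["hatsune_miku"]))  # large transposition
```

Under this rule a small typo such as `hatsune_mkiu` stays within the threshold of `hatsune_miku`, while the swapped form `miku_hatsune` is too many edits away to be corrected, which matches the behavior described above.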
2020-12-13 00:45:22 -06:00
parent 119268e118
commit 6a46aeb55c
5 changed files with 31 additions and 7 deletions
--- a/db/structure.sql
+++ b/db/structure.sql
@@ -9,6 +9,21 @@ SET xmloption = content;
 SET client_min_messages = warning;
 SET row_security = off;

+
+--
+-- Name: fuzzystrmatch; Type: EXTENSION; Schema: -; Owner: -
+--
+
+CREATE EXTENSION IF NOT EXISTS fuzzystrmatch WITH SCHEMA public;
+
+
+--
+-- Name: EXTENSION fuzzystrmatch; Type: COMMENT; Schema: -; Owner: -
+--
+
+COMMENT ON EXTENSION fuzzystrmatch IS 'determine similarities and distance between strings';
+
+
 --
 -- Name: pg_trgm; Type: EXTENSION; Schema: -; Owner: -
 --
@@ -7420,6 +7435,7 @@ INSERT INTO "schema_migrations" (version) VALUES
 ('20200520060951'),
 ('20200803022359'),
 ('20200816175151'),
-('20201201211748');
+('20201201211748'),
+('20201213052805');