Membangun rantai kata yang panjang

17

Tantangan ini adalah untuk menemukan rantai kata-kata bahasa Inggris terpanjang di mana 3 karakter pertama dari kata berikutnya cocok dengan 3 karakter terakhir dari kata terakhir. Anda akan menggunakan kamus umum yang tersedia di distribusi Linux yang dapat diunduh di sini:

https://www.dropbox.com/s/8tyzf94ps37tzp7/words?dl=0

yang memiliki 99171 kata bahasa Inggris. Jika Linux lokal Anda /usr/share/dict/wordsadalah file yang sama (memiliki md5sum == cbbcded3dc3b61ad50c4e36f79c37084), Anda dapat menggunakannya.

Kata-kata hanya dapat digunakan satu kali dalam jawaban.

EDIT: Surat harus cocok persis termasuk huruf besar / kecil, apostrof, dan aksen.

Contoh dari jawaban yang valid adalah: idea deadpan panoramic micra craftsman mantra traffic fiche yang akan skor 8.

Jawaban dengan rantai kata terpanjang yang valid akan menjadi pemenang. Dalam hal seri, jawaban paling awal akan menang. Jawaban Anda harus mencantumkan rantai kata-kata yang Anda temukan dan (tentu saja) program yang Anda tulis untuk melakukannya.

Ksatria Logika
sumber
Apa cara yang diizinkan untuk memuat data ini?
Hacketo
Anda harus mengunduh daftar kata dari tautan dropbox di atas, lalu memprosesnya dengan program Anda untuk menghasilkan jawaban terbaik yang Anda bisa.
Logic Knight
Apakah aksen penting? Juga, bagaimana dengan kata-kata yang lebih pendek dari tiga huruf?
KSFT
Kata-kata yang lebih pendek dari 3 huruf tidak dapat memenuhi aturan pencocokan 3 huruf, jadi tidak termasuk. Huruf harus cocok termasuk aksen dan huruf besar / kecil.
Logic Knight
1
Analisis saya terhadap grafik adalah bahwa hanya ada satu komponen yang terhubung sangat non-sepele, dan jalur terpanjang dalam grafik terkompresi adalah panjang 6. Dengan mempertimbangkan minimum untuk setiap urutan tiga huruf kata dalam SCC dengan urutan sebagai awalan dan akhiran, saya mendapatkan batas atas 4655 pada rantai kata terpanjang, jadi mungkin masih ada banyak ruang untuk diperbaiki pada yang terbaik saat ini tahun 1733.
Peter Taylor

Jawaban:

9

Java, heuristik mendukung vertex yang menginduksi grafik terbesar: 1825 1855 1873

Kode di bawah ini berjalan dalam waktu kurang dari 10 menit dan menemukan jalur berikut:

[wad, wadis, dis, dismay, may, mayfly, flywheels, elsewhere, erecting, ingratiate, ateliers, ersatzes, zest, esthetic, tickled, ledger, germicide, idealizes, zestful, fulling, ingrains, institute, uterine, ineptness, essaying, ingeniously, slyness, essences, cessations, onshore, ores, resoundingly, glycerine, inertness, essay, say, saying, ingenuous, ousted, tediously, sly, slyest, estrogen, genuflecting, ingestion, ionizer, zeros, roses, sesames, mes, meshed, hedonist, isthmuses, sesame, amending, ingredient, entrapment, enthuses, session, ionosphere, erectness, essayist, isthmus, mustaches, hesitatingly, glycogen, generation, ions, onset, settable, blew, lewder, deriding, ingratiates, testicles, lessen, sensitization, ionization, ionizing, ingratiating, ingenious, ouster, terrorizing, ingest, estranges, gesticulating, ingrates, testis, tissue, sue, suede, edelweiss, issuing, ingraining, ingrown, owner, nerdiest, estimation, ionospheres, rescue, cue, cueing, ingesting, ingot, got, gotten, tensor, sorrowing, ingratiated, tedious, ousting, ingratiatingly, glycerin, ringside, identifiable, bleariest, ester, terminological, calibrator, torrent, entraps, apse, pseudonym, nymphomania, niacin, cinema, emailed, led, ledges, gesticulates, testicle, clement, entail, ail, ailment, enter, terrains, inspires, restaurateur, euros, rosiest, estimates, tester, termite, iterator, torture, urethras, raspiest, estimator, tore, oregano, anointment, enthuse, useful, fulfil, filmstrip, riposte, stereotyped, pedicure, urea, readmits, itself, elf, elfin, finagles, lesbians, answerable, bleat, eatery, erythrocytes, testosterone, one, ones, nest, esteemed, medicine, inextricable, blessed, sediment, entry, try, tryout, outsources, cesarians, answered, redressed, seducer, cervical, calumniates, test, establishment, entombment, enthusiastic, tickles, lessens, ensemble, blemishes, hesitant, antic, tick, ickiest, estimable, blemished, hedgehog, hogan, gantlet, letdown, own, ownership, hippest, estates, testates, testiest, establishes, hes, hesitates, testable, bleakest, esthetes, testament, entice, iceberg, erg, ergonomic, microscope, operatives, vestibules, lesser, serenade, adenoidal, dales, lest, estrangement, entrap, raptures, resourceful, fulsome, omen, menswear, earthliest, established, hedges, gestates, testy, styes, yeshivot, voter, terrible, blender, derides, descent, enticed, cedillas, lass, assailable, bleacher, hermit, mite, item, temperas, rash, ashtray, rayon, yonder, dermis, mismanage, agendas, dash, ashy, shy, shyster, terrapins, insatiable, bleeder, derives, vestment, entangle, glen, lengthens, ensconced, ceded, deduced, cedars, arsenic, nice, ice, iced, cedar, daredevil, villa, llamas, masseuse, use, useable, bleach, achievable, bleached, hedonistic, tic, ticker, kerchieves, vessel, sell, ell, elliptic, ticket, kettles, lessee, seeps, epsilon, longboat, oath, atherosclerosis, sisterhood, oodles, lesson, sonatas, tassel, selvage, age, agent, entranced, cedes, descender, deranges, gestures, restraint, interment, enthused, seduced, cedilla, llama, amalgam, gamut, mutable, blend, endear, earthy, thymus, mussel, seltzer, zero, erodes, despot, potful, fulfillment, enthrall, allot, lotus, tussle, sledgehammered, redolent, entrapped, pedestal, talk, alkalis, listen, tended, deductible, bleeped, pedigree, reentered, redistribute, uterus, rustproofed, fed, fedora, oranges, gesundheit, either, herdsman, manes, nestles, lessor, sorrowful, fullback, acknowledges, gestured, redoubtable, blended, deduces, cesareans, answer, werewolves, vesper, perseveres, restructures, reside, ideogram, rammed, meddlesome, omens, ensign, ignores, restrains, insolent, entanglement, entrenchment, enticement, entomological, calligraphy, physical, calico, iconoclast, astringent, entertainment, entrant, antennas, nasty, stymie, miens, enslave, averred, redefine, inexorable, blenched, hedgerow, rowboat, oat, oaten, tend, endears, arson, songwriter, terminable, blent, entreaty, atypical, calypso, psoriasis, sister, term, ermine, ineligible, bleaker, kerosene, enema, emancipator, tormentor, torrider, derailment, entertains, instil, tildes, destine, inelegant, anthropomorphic, hiccup, cupolas, lastingly, glycerol, rollback, acknowledgment, entombed, bedridden, denser, servicewomen, menopause, used, sedatives, vesicle, clearinghouse, user, servant, antipodes, descry, crystalline, inexpedient, enthusiast, astonishment, entirety, etymological, calendared, redbreast, astronomer, merinos, nosedove, overpay, pay, paymaster, termagant, antiaircraft, aftercare, ares, resentful, fulcrum, rumpus, pushcart, artiste, stethoscopes, pesetas, taste, steadfast, astride, ides, destitute, utensil, silvan, vanguard, ardent, entryway, waysides, despair, airdrop, ropes, pestered, redder, derangement, entered, redeemed, medullas, lasagnas, nasal, salsas, sashay, hay, haymow, mow, mowed, wedder, derringer, germane, anemic, microfilmed, media, diatom, tomboys, oyster, terminator, toreador, dorsal, salespeople, pleased, sedater, terabit, bitten, tentacle, clergyman, manifesto, stomach, achoo, hoopla, plaza, azalea, leaven, vendor, dormant, antiparticle, cleared, redraft, afterword, ordains, insufficient, entitlement, entomb, ombudsmen, men, mental, tallyhos, hospice, icecap, cape, aperitif, tiffed, fedoras, rasped, pediatric, rickshaw, hawker, keratin, tinctures, reset, setback, acknowledgement, enthronement, entwine, inexact, actor, torpedos, dosed, sedan, dancer, cerebrum, rumple, plea, leach, ache, cheaper, per, periscopes, pestilent, entreat, eater, terser, serape, ape, apes, pesky, skycap, capped, pederast, astuter, terrace, acetaminophen, henchmen, menopausal, saltcellar, lard, ardor, dormice, icebound, underbrush, ushered, redrew, rewound, underclass, assassin, sinew, newscast, astrologer, gerund, undertaken, ken, kens, ensnared, redcap, cappuccinos, nostrum, rum, rumored, redraw, rawhide, identical, calcine, inertia, tiara, arabesque, queerer, reruns, unsold, oldie, diesel, selectmen, mentored, redden, dental, talon, longhand, and, androgen, genome, omelet, lethal, hallucinogenic, nickname, amen, menhaden, denudes, despaired, redevelop, lope, operas, rasp, aspired, redskin, kindergartens, ensnares, resultant, anthropological, callus, lustful, fulcra, crammed, mediocre, crepes, pesticide, ideas, eastbound, under, derrières, respired, rediscovered, redundant, antihero, erode, ode, odes, described, bedevil, villager, gerrymander, deride, ideograph, aphid, hid, hides, describes, besides, despoil, oilskin, kingdom, dominant, ant, antipasti, stiffens, ensured, redeemer, merchant, antiwar, warped, pederasty, stylus, lush, usher, her, hereafter, terrapin, pinnacle, clerical, caliber, bereave, avenger, geriatric, rickshas, haste, stereoscopes, pester, termini, initiator, tortures, restorer, reran, ransomed, medulla, llanos, nostril, rill, illogical, calif, lifer, fervor, vortex, textures, resister, termed, medieval, valor, lord, ordered, rediscover, verbatim, times, mesdames, mescal, caliper, periscope, opera, erasures, restart, artichokes, kestrel, reliant, antebellum, lumbago, agog, goggle, gleeful, fulfill, illustrator, tor, torque, questionnaires, resumed, mediator, tort, orthodoxy, oxymora, oratorio, riot, iotas, taster, terrific, fiche, checkpoint, interloper, perfumes, mesas, sassafras, rasher, heraldry, drywall, all, allergens, ensnare, area, rearm, armchair, airman, manufactures, resurface, acerbic, bicycle, cleverer, rerun, runt, untidy, idyllic, lichens, ensures, resend, endemic, microchip, hippopotamus, muscatel, telecast, astronaut, autopilot, lot, loth, other, heros, rosin, single, gleamed, mediaeval, valet, lettered, redound, underside, ideological, calliper, perihelia, liaison, sonic, nicknames, messenger, germicides, descendant, antigen, genital, tall, allergen, gentleman, mangos, gossipped, pedicures, resistant, antlered, redeveloped, pedagogical, calligrapher, heroins, inside, idea, deafen, fen, fencer, cerebra, bravuras, rascal, calculus, lusher, herbivores, resins, instill, illicit, citric, ricochet, heterodoxy, oxygen, generic, rice, icebox, box, boxcar, cartography, physique, quell, ellipsis, sis, sisal, sallow, lowbrow, rowel, well, elliptical, calf, alfresco, scow, cow, cowboy, boy, boyfriend, end, endeared, red, redesign, ignoramus, musket, kettledrum, rump, umped, pedlar, larvas, vassal, salmonellas, last, astronomical, calfskin, kingfisher, hereupon, ponchos, hospital, talisman, mantel, telethon, honcho, chomped, pedant, antitoxins, instant, antipastos, tossup, superintend, endangered, redskins, instigator, torpor, portico, icon, conquistador, dormer, merganser, seraphic, hiccuped, pedagogue, guerrillas, laser, sera, eraser, seraph, aphasic, sickbed, bed, bedsores, resign, ignorant, anthropocentric, richer, herdsmen, menu, enures, resuscitator, tornado, ado, adobe, obeisant, anthill, illegal, gallon, longshoremen, menace, ace, acetylene, enemas, mas, mascot, cot, cotton, tonsures, restores, result, ultraviolet, letterbox, boxer, xerography, physiological, calmer, merchantmen, mentor, torus, russet, settee, teenager, gerbil, billfold, old, olden, denatures, resubmit, mitten, ten, tenon, nonchalant, antique, queasy, asymmetric, ricksha, shanghai, haircut, cutups, upsides, descriptor, torpid, pidgin, gins, instep, tepee, peeper, perturb, urbane, anemia, miasmas, mascaras, raspy, spy, spyglass, assures, resonator, tortilla, llano, anon, nontechnical, calabash, ashram, rampart, arthropod, podia, diagram, ramp, amp, amphitheatres, resistor, tortillas, lasagna, gnat, natal, talc, alcoholic, licensee, seemed, medical, calm, almanac, nacho, choreography, phylum, lumbar, barman, mannequins, insures, respires, resound, underarm, armatures, resides, desideratum, tumult, ultrasound, underdog, dogcatcher, herald, alderwoman, mandarins, insecticides, desires, respirator, torrid, rid, rides, descant, anticlimax, maximum, mum, mummer, meringue, guesser, sermon, monogram, ramrod, rodeo, deodorant, antelopes, peso, esophagus, gusset, setups, upshot, hotel, telescope, open, penicillin, lingos, gossip, sip, siphon, honor, normal, maltreat, eaten, tenet, nether, herpes, pesticides, descend, endow, downfall, alleyway, way, waylay, layman, manicures, reshuffle, flea, lea, leash, ashen, henchman, mandolin, linchpins, inscribes, bestow, townspeople, plectrum, rumbas, baste, sternum, numb, umbilici, icicle, cleaver, vertebra, brains, insouciant, antidepressant, anthem, hemoglobin, binocular, largos, gossamer, mermaid, aid, aides, desperado, adopt, opt, optima, imam, mambos, bosun, sun, sunspot, potpourris, risky, sky, skyscraper, perturbed, bedraggle, glee, lee, leech, echo, choreographer, heraldic, dictum, tumid, midday, day, daybed, bedsides, desktop, topknot, notepaper, periodical, calendar, dare, areas, easel, selfsame, amebas, basins, ins, insulin, linnet, nettlesome, omegas, gasp, aspartame, amend, endures, researcher, herbal, balsas, sass, assault, ultimatum, tumor, mortgagor, gores, resort, orthopaedic, dictatorship, hipper, person, sonar, narc, arc, archduke, ukelele, elegant, anther, hereabout, outfox, fox, foxtrot, rotogravures, restaurant, antechamber, beret, retriever, verbena, enamor, morsel, sellout, outmaneuver, vertical, call, allergenic, niche, chessman, mandolins, insipid, pidgins, install, allures, rescind, indignant, antithetical, calicos, cosmonaut, auto, utopia, piano, another, heretical, calk, alkali, alibi, ibis, bistro, troupe, upend, endorser, serviceman, mandarin, rind, inductee, teepee, pee, peekaboo, bootstrap, rape, apertures, resin, singular, larval, valiant, antiperspirant, antipasto, stop, topical, calisthenic, nicer, cervix, vixen, xenophobic, bicep, cephalic, licit, citizenship, hippopotami, amigos, gospel, pellet, letups, upstart, artificer, cerebellum, lumberman, manic, nicknamed, medic, dickie, kielbasas, sash, ash, ashcan, cannon, nonskid, kid, kidnaper, perjures, resolver, vernacular, larkspur, puree, reefer, ferret, retains, insofar, far, fart, artisan, sandbag, bagel, gelatin, tinsel, selectman, manacle, clever, versus, sustains, inscribed, bedpan, pandemic, microprocessor, sorbet, bet, betcha, char, harem, remodel, deli, elicit, citadel, deliver, verandas, dashikis, kisser, servicemen, menthol, holiday, daydreamer, merchantman, manikins, insane, anew, newsprint, interwove, overreach, achieve, even, venom, nomad, mad, madrigal, gala, alarm, armpit, pitchman, manor, northbound, underbid, bidet, detox, toxemia, miasma, smarten, tenderloins, insult, ultra, travel, velvet, veteran, random, domino, inopportune, uneconomic, microbes, bestir, tiro, ironware, arena, enamel, melodramas, mastodon, don, donut, nut, nutmeg, meg, megalopolis, lissom, sombre, breathe, therefrom, romper, performer, merman, mangrove, overshadow, downcast, astir, tiros, rostra, trachea, heaven, ventricle, clergywoman, maneuver, verbal, ballad, ladyship, hippie, pie, piebald, alderman, manatee, teethe, thereupon, poncho, choicer, ceramic, microscopic, picayune, uneaten, tendon, donor, northeast, astound, underpass, assessor, sorghum, hum, human, mantra, trainee, needlepoint, interplay, laywoman, mannikin, kinsman, mantillas, lassie, sieve, ever, verdigris, risen, sensor, sorrel, relabel, belabor, borsch, schlep, leprechauns, unsnap, nap, napkin, kin, kingpin, pinkeye, eyeglass, assemblyman, manikin, kingship, hip, hippos, postpartum, tumbrel, relic, lichee, heehaw, haw, hawser, servicewoman, many, anyhow, howsoever, vertex, text, extra, trap, rap, rapper, periwig, wigwag, wag, wagon, gonorrhea, heave, aver, vermin, minesweeper, perplex, lexicon, congas, gastronomic, microfiche, cheapen, pentathlon, longhair, air, aircraft, aft, aftertaste, stem, tempos, postwar, war, wart, article, clear, earshot, hotshot, hotbed, bedlam, lam, lambkin, kindergarten, tenser, serum, rumor, mortar, tarmac, macabre, breech, echos, hostel, telescopic, pickerel, relay, laypeople, pleas, east, astronomic, micra, crackpot, pot, potato, atom, tombed, bedbug, bugaboo, bootleg, leg, legato, atop, topple, plethora, orangutang, angora, orangutan, tan, tandem, democrat, rat, rattan, tang, angry, gryphon, honeybee, bee, beeswax, waxen, xenon, nonplus, lustre, trellis, lisle, sleepwear, earwig, wig, wigwam, wampum, pummel, melanomas, massacre, cretin, tin, tint, interviewee, wee, weeper, persimmon, monsignori, origin, gingham, ham, hamper, pericardia, diarrhea, heartthrob, rob, robes, besom, sombreros, rosebud, bud, budgerigar, garret, retrodden, denim, nimbus, bus, bushel, helmet, metaphor, horsefly, flypaper, peritonea, near, ear, earlobes, bestowal, wall, allay, layout, outlast, astrakhan, handicapper, perusal, saltpetre, tremor, moribund, undercut, cut, cutoff, off, offal, falcon, con, consul, sultan, tannin, ninepin, pinball, allegro, grommet, metro, trot, rot, rotten, tenpin, pineapple, plectra, transit, sitar, tar, taro, arousal, salmon, moneybag, bagpipe, ipecac, cache, checkout, outrun, runaround, undersea, sea, sear, earache, cherub, rub, rubicund, underpin, pin, pint, intagli, glib, lib, libel, bellyache, cherubim, bimbos, bosuns, unsound, undertow, tow, towel, wellington, ton, tonsil, silicon, concoct, octet, tetrahedra, drachmae, maestri, tripos, possum, sum, sumac, macro, crocus, custom, tom, tomcat, catsup, sup, superstar, tarpaulin, linchpin, pinpoint, intercom, comet, met, metacarpus, pussycat, catastrophe, phenomenon, nonverbal, ballpoint, interurban, bani, animal, malt, altar, tartar, tarot, rotund, undergrad, radio, diocesan, sandbar, bar, barren, renewal, walkout, outstrip, ripen, pen, pencil, cilantro, trout, outran, rancor, corncob, cob, cobra, bra, brag, rag, ragas, gas, gasohol, holdout, output, put, putsch, schwas, was, waste, stereo, reoccur, cur, curb, urban, ban, bantam, tam, tamp, ampul, pullout, outwit, wit, withal, halo, alohas, hasp, asparagus, gusto, stove, overlap, lapel, pelvis, visit, sit, sitcom, compendia, diadem, demigod, god, goddam, dam, dampen, pennon, non, noncom, compel, pelican, cancan, can, cancel, celesta, starlit, lit, litmus, muscat, cat, catnap, naphtha, than, handcar, carpel, pellagra, grammar, mar, mariachi, chichi, chi, chimp, imp, impel, pelvic, vicar, car, caribou, bourgeoisie, siesta, stab, tab, tabu, abut, but, butterfat, fat, fathom, homespun, pun, puns, unsheathe, the, theorem, remove, overtax, tax, taxicab, cab, cabal, balsam, sambas, basal, salamis, missal, salt, altho, tho, thou, housebound, underground, underclassman, man, mannikins, insectivores, resonant, antelope, operator, torn, ornamental, tallow, low, lowered, reddens, enshrine, inefficient, entertainer, nerves, vestiges, gesturing, ingested, tediousness, essentials]

Ide inti

Dalam Perkiraan jalur dan siklus terarah terpanjang , Bjorklund, Husfeldt, dan Khanna, Lecture Notes in Computer Science (2004), 222-233, penulis mengusulkan bahwa dalam grafik expander yang jarang, jalur panjang dapat ditemukan oleh pencarian serakah yang memilih masing-masing langkah tetangga dari ekor saat ini dari jalur yang membentang subgraf terbesar di G ', di mana G' adalah grafik asli dengan simpul di jalur saat ini dihapus. Saya tidak yakin dengan cara yang baik untuk menguji apakah suatu grafik adalah grafik expander, tapi kami jelas berhadapan dengan grafik yang jarang, dan karena intinya adalah sekitar 20.000 simpul dan memiliki diameter hanya 15, ia harus memiliki yang baik properti ekspansi. Jadi saya mengadopsi heuristik serakah ini.

Diberikan grafik G(V, E), kita dapat menemukan berapa banyak simpul yang bisa dijangkau dari setiap simpul menggunakan Floyd-Warshall dalam Theta(V^3)waktu, atau menggunakan algoritma Johnson dalam Theta(V^2 lg V + VE)waktu. Namun, saya tahu bahwa kita sedang berhadapan dengan grafik yang memiliki komponen terhubung sangat besar (SCC), jadi saya mengambil pendekatan yang berbeda. Jika kami mengidentifikasi SCC menggunakan algoritma Tarjan maka kami juga mendapatkan semacam topologi untuk grafik terkompresi G_c(V_c, E_c), yang akan jauh lebih kecil, pada O(E)waktunya. Karena G_cini adalah DAG, kita dapat menghitung keterjangkauan G_cdalam O(V_c^2 + E_c)waktu. (Saya kemudian menemukan bahwa ini mengisyaratkan dalam latihan 26-2.8 dari CLR ).

Karena faktor dominan dalam waktu berjalan adalah E, saya mengoptimalkannya dengan menyisipkan simpul dummy untuk awalan / sufiks. Jadi, alih-alih 151 * 64 = 9664 tepi dari kata-kata yang berakhir -res ke kata-kata mulai res- Saya memiliki 151 tepi dari kata-kata yang berakhir -res ke # res # dan 64 tepi dari # res # untuk kata-kata yang dimulai res- .

Dan akhirnya, karena setiap pencarian memakan waktu sekitar 4 menit pada PC lama saya, saya mencoba untuk menggabungkan hasil dengan jalur panjang sebelumnya. Ini jauh lebih cepat, dan muncul solusi terbaik saya saat ini.

Kode

org/cheddarmonk/math/graph/Graph.java:

package org.cheddarmonk.math.graph;

import java.util.Set;

public interface Graph<V> {
    public Set<V> getAdjacent(V node);
    public double getWeight(V from, V to);
}

org/cheddarmonk/math/graph/MutableGraph.java:

package org.cheddarmonk.math.graph;

import java.util.*;

public class MutableGraph<V> implements Graph<V> {
    private Map<V, Map<V, Double>> edgesBySource = new HashMap<V, Map<V, Double>>();

    public void addEdge(V from, V to, double weight) {
        if (!edgesBySource.containsKey(to)) edgesBySource.put(to, new HashMap<V, Double>());
        Map<V, Double> e = edgesBySource.get(from);
        if (e == null) edgesBySource.put(from, e = new HashMap<V, Double>());
        if (e.containsKey(to)) throw new IllegalArgumentException("There is already an edge between the vertices");
        e.put(to, weight);
    }

    @Override
    public Set<V> getAdjacent(V node) {
        Map<V, Double> e = edgesBySource.get(node);
        if (e == null) throw new IllegalArgumentException("node doesn't appear to be in the graph");
        return Collections.unmodifiableSet(e.keySet());
    }

    @Override
    public double getWeight(V from, V to) {
        Map<V, Double> e = edgesBySource.get(from);
        if (e == null) throw new IllegalArgumentException("from doesn't appear to be in the graph");
        if (!edgesBySource.containsKey(to)) throw new IllegalArgumentException("to doesn't appear to be in the graph");

        Double c = e.get(to);
        return c == null ? 0 : c.doubleValue();
    }
}

org/cheddarmonk/math/graph/StronglyConnectedComponents.java:

package org.cheddarmonk.math.graph;

import java.util.*;

/**
* A helper class for finding the strongly connected components of a graph using Tarjan's algorithm.
* http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
*/
public class StronglyConnectedComponents<V> {
    private final Graph<V> g;
    private List<Set<V>> topologicallySortedSccs = new ArrayList<Set<V>>();

    private final LinkedList<V> S = new LinkedList<V>();
    private final Set<V> S2 = new HashSet<V>();
    private final Map<V, Integer> index = new HashMap<V, Integer>();
    private final Map<V, Integer> lowlink = new HashMap<V, Integer>();

    private StronglyConnectedComponents(Graph<V> g) {
        this.g = g;
    }

    private void strongConnect(V v) {
        int idx = index.size();
        index.put(v, idx);
        lowlink.put(v, idx);

        S.push(v);
        S2.add(v);

        for (V w : g.getAdjacent(v)) {
            if (!index.containsKey(w)) {
                strongConnect(w);
                if (lowlink.get(w) < lowlink.get(v)) {
                    lowlink.put(v, lowlink.get(w));
                }
            }
            else if (S2.contains(w)) {
                if (index.get(w) < lowlink.get(v)) {
                    lowlink.put(v, index.get(w));
                }
            }
        }

        if (lowlink.get(v).equals(index.get(v))) {
            Set<V> scc = new HashSet<V>();
            V w;
            do {
                w = S.pop();
                S2.remove(w);
                scc.add(w);
            } while (!w.equals(v));

            topologicallySortedSccs.add(scc);
        }
    }

    public static <V> List<Set<V>> analyse(Graph<V> g, Set<V> sources) {
        if (g == null) throw new IllegalArgumentException("g");

        StronglyConnectedComponents<V> scc = new StronglyConnectedComponents<V>(g);
        for (V v : sources) {
            if (!scc.index.containsKey(v)) {
                scc.strongConnect(v);
            }
        }

        return scc.topologicallySortedSccs;
    }
}

org/cheddarmonk/ppcg/PPCG.java:

package org.cheddarmonk.ppcg;

import java.io.*;
import java.util.*;
import org.cheddarmonk.math.graph.*;

public class PPCG44922 {
    private static final String path = "/usr/share/dict/words";
    private static Set<String> allWords;
    private static Graph<String> fullGraph;

    public static void main(String[] args) {
        loadGraph();

        Random rnd = new Random();
        rnd.setSeed(8104951619088972997L);
        List<String> a = search(rnd);
        rnd.setSeed(-265860022884114241L);
        List<String> b = search(rnd);
        List<String> chain = spliceChains(a, b);
        System.out.println(chain.size());
        System.out.println(chain);
    }

    private static List<String> search(Random rnd) {
        List<String> chain = new ArrayList<String>();
        chain.add(selectOptimalReachabilityCount(fullGraph, allWords, rnd));
        while (true) {
            String tail = chain.get(chain.size() - 1);
            FilteredGraph g = new FilteredGraph(chain);

            // We know that tail only has one successor, so skip ahead.
            Set<String> candidates = new HashSet<String>(fullGraph.getAdjacent(suffix(tail)));
            candidates.removeAll(chain);
            if (candidates.isEmpty()) break;

            chain.add(selectOptimalReachabilityCount(g, candidates, rnd));
        }

        Iterator<String> it = chain.iterator();
        while (it.hasNext()) {
            if (it.next().charAt(0) == '#') it.remove();
        }
        return chain;
    }

    private static List<String> spliceChains(List<String> a, List<String> b) {
        Set<String> intersect = new HashSet<String>(b);
        intersect.retainAll(a);
        if (intersect.isEmpty()) return null;

        // Splice the longest bits. To avoid cycles, we look for intersection points which have the same set of reached intersection points.
        // Thus to get from one to the next we can take either route without violating the unique occurrence of each element in the spliced chain.
        Set<String> left = new HashSet<String>();
        Set<String> right = new HashSet<String>();
        List<String> newChain = new ArrayList<String>();

        // NB We assume that either a(0) and b(0) are the same or neither is in intersect.
        // This is a safe assumption in practice because they're both "wad".
        int idxA = 0, idxB = 0, nextA = 0, nextB = 0;
        while (idxA < a.size()) {
            nextA++;
            while (nextA < a.size() && !intersect.contains(a.get(nextA))) nextA++;
            String tailA = nextA < a.size() ? a.get(nextA) : "";
            left.add(tailA);

            nextB++;
            while (nextB < b.size() && !intersect.contains(b.get(nextB))) nextB++;
            String tailB = nextB < b.size() ? b.get(nextB) : "";
            right.add(tailB);

            if (left.equals(right) && tailA.equals(tailB)) {
                // We take the longer of idxA to nextA-1 or idxB to nextB - 1.
                if (nextA - idxA > nextB - idxB) newChain.addAll(a.subList(idxA, nextA));
                else newChain.addAll(b.subList(idxB, nextB));

                idxA = nextA;
                idxB = nextB;
            }
        }

        if (new HashSet<String>(newChain).size() == newChain.size()) return newChain;
        throw new IllegalStateException();
    }

    private static void loadGraph() {
        Set<String> words = new HashSet<String>();
        Set<String> prefixes = new HashSet<String>();
        Set<String> suffixes = new HashSet<String>();
        try {
            BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(path), "UTF-8"));
            String line;
            while ((line = br.readLine()) != null) {
                if (line.length() >= 3) {
                    words.add(line);
                    prefixes.add(prefix(line));
                    suffixes.add(suffix(line));
                }
            }
            br.close();
        }
        catch (IOException ioe) {
            throw new RuntimeException(ioe);
        }

        // Filter down to a core with decent reachability.
        prefixes.retainAll(suffixes);
        MutableGraph<String> g = new MutableGraph<String>();
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            String line = it.next();
            if (prefixes.contains(prefix(line)) && prefixes.contains(suffix(line))) {
                // In the interests of keeping the number of edges down, I insert fake vertices.
                g.addEdge(prefix(line), line, 1);
                g.addEdge(line, suffix(line), 1);
            }
            else it.remove();
        }

        fullGraph = g;
        allWords = Collections.unmodifiableSet(words);
    }

    private static String prefix(String word) {
        return "#" + word.substring(0, 3) + "#";
    }

    private static String suffix(String word) {
        return "#" + word.substring(word.length() - 3, word.length()) + "#";
    }

    private static <V> Map<V, Integer> reachabilityCount(Graph<V> g, Set<V> sources) {
        List<Set<V>> sccs = StronglyConnectedComponents.analyse(g, sources);
        int n = sccs.size();

        // Within a strongly connected component, each vertex can reach each other vertex.
        // Then we need to also take into account the other SCCs which they can reach.
        // We can exploit the fact that we already have a topological sort of the DAG of SCCs to do this efficiently.
        Map<V, Integer> index = new HashMap<V, Integer>();
        for (int i = 0; i < n; i++) {
            for (V v : sccs.get(i)) index.put(v, i);
        }

        BitSet[] reachableSccs = new BitSet[n];
        Map<V, Integer> reachabilityCounts = new HashMap<V, Integer>();
        for (int i = 0; i < n; i++) {
            Set<V> scc = sccs.get(i);
            reachableSccs[i] = new BitSet(n);
            reachableSccs[i].set(i);
            for (V v : scc) {
                for (V w : g.getAdjacent(v)) {
                    int j = index.get(w);
                    if (j < i) reachableSccs[i].or(reachableSccs[j]);
                }
            }

            int count = 0;
            for (int j = reachableSccs[i].nextSetBit(0); j >= 0; j = reachableSccs[i].nextSetBit(j+1)) {
                count += sccs.get(j).size();
            }
            for (V v : scc) {
                reachabilityCounts.put(v, count);
            }
        }

        return reachabilityCounts;
    }

    private static <V extends Comparable<? super V>> V selectOptimalReachabilityCount(Graph<V> g, Set<V> candidates, Random rnd) {
        Map<V, Integer> r = reachabilityCount(g, candidates);

        int max = 0;
        List<V> attaining = new ArrayList<V>();
        for (V candidate : candidates) {
            int score = r.get(candidate);
            if (score > max) {
                max = score;
                attaining.clear();
            }
            if (score == max) attaining.add(candidate);
        }

        return selectRandom(attaining, rnd);
    }

    private static <T extends Comparable<? super T>> T selectRandom(Collection<T> elts, Random rnd) {
        List<T> deterministic = new ArrayList<T>(elts);
        Collections.sort(deterministic);
        Collections.shuffle(deterministic, rnd);
        return deterministic.get(0);
    }

    private static class FilteredGraph implements Graph<String> {
        private final Set<String> filteredVertices;

        public FilteredGraph(Collection<String> filteredVertices) {
            this.filteredVertices = new HashSet<String>(filteredVertices);
        }

        @Override
        public Set<String> getAdjacent(String node) {
            if (filteredVertices.contains(node)) return Collections.emptySet();

            Set<String> adj = new HashSet<String>(fullGraph.getAdjacent(node));
            adj.removeAll(filteredVertices);
            return adj;
        }

        @Override
        public double getWeight(String from, String to) {
            throw new RuntimeException("Not used");
        }
    }
}
Peter Taylor
sumber
"mendirikan" hmm ..
Matius Roh
6

Ruby, 1701

"Del" -> "ersatz's"( urutan penuh )

Mencoba menemukan solusi optimal terbukti terlalu mahal dalam waktu. Jadi mengapa tidak memilih sampel acak, cache apa yang kita bisa dan berharap yang terbaik?

Pertama a Hashdibangun yang memetakan awalan ke dunia penuh dimulai dengan awalan itu (misalnya "the" => ["the", "them", "their", ...]). Kemudian untuk setiap kata dalam daftar metode sequenceini dipanggil. Ia mendapat kata-kata yang mungkin bisa diikuti dari Hash, mengambil sampel 100dan menyebut dirinya secara rekursif. Yang terpanjang diambil dan ditampilkan dengan bangga. Benih untuk RNG ( Random::DEFAULT) dan panjang urutan juga ditampilkan.

Saya harus menjalankan program beberapa kali untuk mendapatkan hasil yang baik. Hasil khusus ini dihasilkan dengan biji328678850348335483228838046308296635426328678850348335483228838046308296635426 .

Naskah

require "json"

def prefix(word); word[0, 3];  end
def suffix(word); word[-3, 3]; end

def sequence(word, prefixes, skip)
  if SUBS.key?(word) && (SUBS[word] - skip) == SUBS[word]
    return SUBS[word]
  end

  skip         += [word] 
  suffix        = suffix(word)
  possibilities = (prefixes[suffix] ||= []) - skip

  if possibilities.empty?
    return [word]
  else
    sequences = possibilities.sample(100).map do |possibility|
      sequence(possibility, prefixes, skip)
    end

    return SUBS[word] = [word] + sequences.max_by(&:size)
  end
end

def correct?(sequence)
  once_only = sequence.all? { |y| sequence.count(y) == 1 }
  following = sequence.each_cons(2).all? { |a,b| a[-3,3] == b[0,3] }

  return once_only && following
end

words = open("words.txt", "r", &:read).split.select { |word| word.size >= 3 }

SUBS     = {}
PREFIXES = {}

# Built a Hash that maps prefixes to an Array of words starting with the prefix.
words.each do |word|
  prefix = prefix(word)

  PREFIXES[prefix] ||= []
  PREFIXES[prefix] << word
end

longest = [1]

words.each do |word|
  PREFIXES[prefix(word)].delete(word)

  sequence = sequence(word, PREFIXES, [word])

  if sequence.size > longest.size
    longest = sequence
  end
end

puts longest.inspect
puts 
puts "Generated with seed: #{Random::DEFAULT.seed}"
puts "Length: #{longest.size}"
puts "Correct: #{correct?(longest)}"
britishtea
sumber
Saya tidak berpikir untuk secara acak mencicipi kemungkinan kata berikutnya! Saya menggunakan ide itu!
KSFT
Berapa lama ini berjalan?
KSFT
Saya tidak mengatur waktu, tetapi saya memperkirakan sekitar 15-20 menit (saya membiarkannya habis saat makan malam).
britishtea
Python saya masih membangun dict (hash) ...
KSFT
Membangun kamus harus cepat (satu iterasi melalui semua kata). Ruby melaporkan beberapa 0.0996detik.
britishtea
5

Skor: 1631 1662 kata

['Aachen', 'hen', 'henceforward', 'ardor', 'dorsal', 'salmon', 'monolog', 'log', 'logjam', 
'jam', 'jamb', 'ambassador', 'dormouse', 'useable', 'bleeding', 'ingenious', 'ouster', 
'terminable', 'bleakness', 'essay', 'say', 'saying', 'ingress', 'essences', 'cession', 
....
....
'ionosphere', 'ere', 'erecting', 'ingratiating', 'ingrate', 'ate', 'ateliers', "ersatz's"]

Anda dapat menemukan seluruh rangkaian di sini: http://pastebin.com/TfAvhP9X

Saya tidak memiliki kode sumber lengkap. Saya mencoba berbagai pendekatan. Tetapi di sini ada beberapa potongan kode, yang seharusnya dapat menghasilkan urutan dengan panjang yang sama. Maaf, itu tidak terlalu indah.

Kode (Python):

Pertama, beberapa preprocessing data.

from collections import defaultdict

with open('words') as f:
    words = [line.strip() for line in f]
words = [word for word in words if len(word)>=3 and word[-2:]!="'s"]

word_connections = defaultdict(list)
for word in words:
    word_connections[word[:3]].append(word)

Kemudian saya mendefinisikan fungsi rekursif (pencarian pertama Kedalaman).

global m
m=0
def find_chain(chain):
    s = set(word_connections[chain[-1][-3:]])-set(chain)
    if len(s)== 0:
        global m
        if len(chain) > m:
            m=len(chain)
            print(len(chain), chain)
    else:
        for w in s:
            if w not in chain:
                find_chain(chain + [w])

for word in words:
    find_chain([word])

Tentu saja ini terlalu lama. Tetapi setelah beberapa waktu ia menemukan urutan dengan 1090 elemen, dan saya berhenti.

Hal selanjutnya yang harus dilakukan, adalah pencarian lokal. Untuk setiap dua tetangga n1, n2 dalam urutan, saya mencoba mencari urutan mulai dari n1 dan berakhir di n2. Jika urutan seperti itu ada, saya masukkan.

def tryimpove(chain, length, startvalue=0):
    s=set(chain)
    for i in range(startvalue,len(chain)-1):
        print(i)
        for sub in sorted(short_sequences([chain[i]],length,chain[i+1]),key=len,reverse=True):

            if len(s & set(sub))==1:
                chain[i:i+1]=sub
                print(i,'improved:',len(chain))
                return tryimpove(chain,length,i)
    return chain

def short_sequences(chain,length,end):
    if 2 <= len(chain):
        if chain[-1][-3:]==end[:3]:
            yield chain
    if len(chain) < length:
        s = set(word_connections[chain[-1][-3:]])-set(chain)
        for w in s:
            for i in short_sequences(chain + [w],length,end):
                yield i

for m in range(5, 100):
    seq = tryimpove(seq,m)

Tentu saja saya juga harus menghentikan program secara manual.

Jakube
sumber
Berapa lama untuk mendapatkan hasil Anda?
Hacketo
Sekitar satu jam untuk menghasilkan urutan 1090, dan satu jam lagi untuk melakukan pencarian lokal.
Jakube
5

PHP, 1742 1795

Saya sudah mengacaukan PHP tentang hal ini. Triknya adalah menyisihkan daftar ke sekitar 20k yang sebenarnya valid, dan hanya membuang sisanya. Program saya melakukan ini secara iteratif (beberapa kata yang dibuang di iterasi pertama berarti yang lain tidak lagi valid) di awal.

Kode saya mengerikan, menggunakan sejumlah variabel global, menggunakan terlalu banyak memori (ia menyimpan salinan seluruh tabel awalan untuk setiap iterasi) dan butuh beberapa hari untuk menghasilkan yang terbaik saat ini, tetapi masih mengelola untuk menjadi pemenang - untuk saat ini. Ini dimulai dengan cukup cepat tetapi semakin lambat & lambat seiring berjalannya waktu.

<?php

  function show_list($list)
  {
      $st="";
      foreach ($list as $item => $blah)
      $st.="$item ";
      return rtrim($st);
  }

  function mysort($a,$b)
  {
      global $parts;
      $a_count=isset($parts[substr($a,-3,3)])?count($parts[substr($a,-3,3)]):0;
      $b_count=isset($parts[substr($b,-3,3)])?count($parts[substr($b,-3,3)]):0;
      return $b_count-$a_count;
  }  

  function recurse($line,$list,$parts)
  {
    global $lines; 
    global $max;
    global $finished;
    global $best;
    $end=substr($line,-3,3);
    $count=0;
    if ($finished)
        return;
    if (isset($parts[$end]))
    {
        $maxp=count($parts[$end])-1;
        if ($maxp<0)
            return;
        $randa=array();
        for ($a=1;$a<3;$a++)
        {
            $randa[mt_rand(0,$maxp)]=1;
        }
        $n=mt_rand(0,$maxp);

        foreach ($parts[$end] as $key => $endpart)
        {

            if (!isset($list[$endpart]))
            {
                $this_part=$parts[$end];
                unset($this_part[$key]);
                $new_parts=$parts;
                unset($new_parts[$end][$key]);
                $list[$endpart]=1;
                recurse($endpart,$list,$new_parts);
                unset($list[$endpart]);
            }
        }

    }
    $count=count($list);
    if ($count>$max)
    {
        //echo "current best: $count\n";
        file_put_contents('best.txt',show_list($list) . "\n",FILE_APPEND);
        $max=$count;
        $best=$list;
    }
  }

  function cull_lines(&$lines)
  {
      global $parts;
      do 
      {    
          $wordcount=count($lines);
          $parts=array();$end_parts=array();
          foreach ($lines as $line)
          {
              if (strlen($line)<3)
                continue;
              $parts[substr($line,0,3)][]=$line;
              if (strlen($line)>3)
                $end_parts[substr($line,-3,3)][]=$line;
          }
          foreach ($lines as $key => $line)
          {
              $end=substr($line,-3,3);
              if (strlen($line)<3 || !isset($parts[$end]) || !isset($end_parts[substr($line,0,3)] ) )
                unset($lines[$key]);
          }
          foreach ($parts as $part)
          {
            if (count($part)==1)
            {
                $source_part=mt_rand(0,count($part)-1);
                $this_part = substr($part[0],0,3);
                $this_min = 10000;
                $this_key = 0;
                if (strlen($part[$source_part])==3)
                {
                    foreach ($lines as $key => $line)
                    if ($line == $part[$source_part])
                    {
                            unset($lines[$key]);
                            break;
                    }
                }
                elseif (isset($end_parts[$this_part]))
                {
                    foreach ($end_parts[$this_part] as $key => $end_part)
                    {
                        if (isset($parts[substr($end_part,0,3)]))
                        {
                            $n=count($parts[substr($end_part,0,3)]);
                            if ($n<$this_min)
                            {
                                $this_min=$n;
                                $this_key=$key;    
                            }
                        }
                    }

                    foreach ($lines as $key => $line)
                    if ($line == $part[$source_part])
                    {
                            unset($lines[$key]);

                    }
                    elseif ($line == $end_parts[$this_part][$this_key])
                    {
                        $lines[$key].=' ' . $part[$source_part];
                    }
                }

            }
          }
          echo "$wordcount words left\n";
      }
      while ($wordcount!=count($lines));
  }

  ini_set('xdebug.max_nesting_level',10000);
  ini_set('memory_limit','1024M');
  $lines = explode("\n",file_get_contents('words.txt'));
  cull_lines($lines);
  $max=0;
  foreach ($parts as $key=>$part)
    usort($parts[$key],'mysort');    

  $n=count($lines);
  foreach ($lines as $rand => $blah)
  {
      if (mt_rand(0,$n)==1)
        break;
  }
  $rand=$lines[$rand];
  $line[$rand]=1;
  echo "Starting with $rand...\n";
  recurse($rand,$line,$parts);
  unset($line[$rand]);

?>

Peningkatan yang jelas akan menggunakan kata yatim untuk awal & akhir.

Bagaimanapun, saya benar-benar tidak tahu mengapa daftar pastebin saya dipindahkan ke komentar di sini, itu kembali sebagai tautan ke pastebin karena saya sekarang sudah memasukkan kode saya.

http://pastebin.com/Mzs0XwjV

awfullyawful
sumber
Saya memasukkan kode dari Pastebin dalam jawaban Anda, maka kami tidak harus bergantung pada domain eksternal; dalam kasus Pastebin turun, kami masih dapat melihat kode Anda di sini.
ProgramFOX
Yah, saya senang memiliki jawaban yang menang untuk sementara waktu setidaknya. Kerja bagus Peter Taylor ...
awfullyawful
Oh, sekarang saya melihat bahwa saya seharusnya tidak menambahkan daftar ke jawaban Anda; Saya agak bingung dan saya mencampuradukkan kode dan daftar kata, dan memasukkan daftar itu dalam jawaban Anda. Saya minta maaf untuk itu.
ProgramFOX
4

Python: 1702 - 1704 - 1733 kata

Saya menggunakan Dict untuk memetakan semua awalan ke semua kata, seperti

{
    "AOL" : [
        "AOL", "AOL's"
    ],...
    "oct" : [
         "octet","octets","octette's"
    ],...
}

Edit sedikit peningkatan: menghapus semua uselesskata di awal, jika sufiksnya tidak ada dalam daftar awalan (jelas akan menjadi kata akhir)

Kemudian ambil satu kata dalam daftar dan ramban peta awalan seperti Node pohon

import sys, fileinput
def p(w):return w[:3]

class Computing:
    def __init__(_):
        _.wordData = []; _.prefixes = {}
        _.seq = []; _.bestAttempt = []
        _.stop = False
        for l in fileinput.input():
            word = l.strip()
            if len(word) > 2:_.wordData.append(_.addPfx(word, p(word)))
        _.rmUseless();_.rmUseless()
        _.fromI = 0; _.toI = len(_.wordData)

    def rmUseless(_):
        for w in _.wordData:
            if not w[-3:] in _.prefixes: _.wordData.remove(_.rmPfx(w,p(w)))

    def addPfx(_, w, px):
        if not px in _.prefixes:_.prefixes[px] = []
        _.prefixes[px].append(w)
        return w
    def rmPfx(_,w,px):
        _.prefixes[px].remove(w)
        if len(_.prefixes[px]) == 0:del _.prefixes[px];
        return w

    def findBestSequence(_):
        def pickItem():
            r = None
            if _.fromI < _.toI:r = _.wordData[_.toI-1];_.toI -= 1;
            return r
        while not _.stop:
            w = pickItem()
            if not w:break;
            _.seq = [_.rmPfx(w,p(w))]
            _.checkForNextWord()
            _.addPfx(w, p(w))

        print " ".join(_.bestAttempt)

    def checkForNextWord(_):
        _.stop = len(_.seq) >= 1733
        cw = _.seq[-1]
        if not _.stop:
            if cw[-3:] in _.prefixes:
                lW = []
                for word in _.prefixes[cw[-3:]]:
                    if not word[-3:] in lW:
                        _.seq.append(_.rmPfx(word,p(word)))
                        _.checkForNextWord()
                        if _.stop :break;
                        _.addPfx(_.seq.pop(), p(word))
                        lW.append(word[-3:])
                    if _.stop :break;
        if len(_.bestAttempt) < len(_.seq):_.bestAttempt = _.seq[:];

sys.setrecursionlimit(6000)
Computing().findBestSequence()

Program perlu sejumlah kata untuk mengetahui kapan berhenti, dapat ditemukan seperti 1733dalam metode inicheckForNextWord̀

Program membutuhkan path file sebagai argumen

Tidak terlalu pythonic tetapi saya mencoba.

Butuh waktu kurang dari 2 menit untuk menghitung urutan ini: output penuh :

Dengan desalate, makan ateliers ... beri harga minuman dingin yang berharga

Retas
sumber
4

Nilai: 249 500 1001

Aachen -> ayam -> selanjutnya -> bersemangat -> memerlukan -> ail -> sakit -> led -> ledger -> gerbil -> bilateral -> rallying>> cerdik -> -> digulingkan -> membosankan -> penggulingan -> terabit -> bit -> menggerutu -> landak -> babi -> hogan -> jantan -> derail -> derail -> ailerons -> onset -> set -> setback -> pengakuan -> mensyaratkan -> ledgers -> ersatzes -> zest -> didirikan -> hedgerow -> row -> rowboat -> oat - -> oaten -> sepuluh -> dapat dipertahankan -> pemutih -> sakit -> lebih murah -> pena -> penalizes -> bersemangat -> fulcra -> kepiting -> rabbi -> rabbi -> calabash -> ash -> malu -> medali -> dale -> ale -> waspada -> membosankan -> licik ->slyest -> mendirikan -> hes -> ragu-ragu -> semut -> antasida -> sari -> tergelincir -> tepian -> gestated -> kebosanan -> esai -> esai -> say - -> pepatah -> cerdik -> kecerdikan -> esai -> cerdik -> menggulingkan -> kecerdikan -> esais -> tanah genting -> tanah genting -> berotot -> kucing -> bencana alam -> tikus -> es -> gunung es -> erg -> ergonomis -> mikra -> kepiting -> tempat tidur -> bedazzles -> lesbian -> jawaban -> itu -> ere - -> mendirikan -> menelan -> membangun -> menelan -> menelan -> ion -> ionisasi -> ionizer -> nol -> mengikis -> ode -> odes -> desalinates -> test -> establishment - -> entailing -> ingot -> got -> gotten ->penyewa -> antagonis -> isthmus -> wijen -> ameba -> basal -> salaamed -> medali -> pengion -> ingraining -> ingrains -> ingrains -> ins -> insane - -> anekdotal -> bedak -> alkohol -> tahan -> tua -> dahulu -> den -> denatur -> urea -> jangkauan -> sakit -> pagar -> pagar gestates -> testable -> bleached -> hedging -> ingrates -> testament -> entangle -> gleamed -> medali -> onshore -> ore -> oregano -> anodes - -> desalinating -> ingratiates -> testates -> tester -> terabits -> nya -> itu sendiri -> elf -> elfin -> fin -> finagle -> finagle -> gleaming -> ingratiating -> ingratiatingly -> gliserin -> kulit -> hutang -> esens ->sesar -> jawab -> pemutih -> dia -> pemberita -> alder -> derailing -> bahan -> belitan -> terjerat -> lesi -> ionosfer -> ereksi - -> ionospheres -> dijual kembali -> mengingatkan -> ingresses -> wijen -> mes -> mesas -> sash -> ashcan -> can -> canard -> ardor -> dorkiest -> estate -> testes -> testicle -> cleaner -> nerdiest -> esteemed -> meddles -> lessee -> see -> seeded -> dedicates -> testicles - -> kurangi -> senat -> testiest -> penghargaan -> tumbuh ke dalam -> pemilik -> pemilik -> saraf -> vesikel -> terbersih -> ester -> terabytes -> testis -> tissue -> sue -> suede -> edema -> emaciates ->testosteron -> satu -> yang -> sarang -> esthetes -> testy -> sty -> styes -> ya -> yeshiva -> vaskuler -> larches -> ragu - ragu - -> gliserin -> termakan -> pemutih -> ragu-ragu -> bersaksi -> ingrate -> ate -> ateliers

Ini kode saya:

import sys
sys.setrecursionlimit(10000)
w=[i for i in open("words.txt").read().split("\n") if len(i)>=3 and "'" not in i]
b=[i[:3] for i in w]
e=[i[-3:] for i in w]
a=[i for i in zip(*(w,b,e))]
def pathfrom(i,a,l):
    longpath=[]
    for j in a:
        if j[1]==i[2]:
            r=pathfrom(j,[m for m in a if m!=j],l+1)
            path=[i]+r[0]
            if r[1]:
                return path,r[1]
            if len(path)>len(longpath):
                longpath=path
    if l>=250:
        return longpath,True
        sys.exit()
    return longpath,False
for i in a[:2]:
    print i
    p=pathfrom(i,[j for j in a if i!=j],1)
    if len(p)>len(chain_):
        chain_=p
        print p
    print p

Sunting: 1001:

http://pastebin.com/yN0eXKZm

Edit: 500:

Aachen -> ayam -> selanjutnya -> bersemangat -> memerlukan -> ail -> sakit -> led -> ledger -> gerbil -> bilateral -> rallying>> cerdik -> -> digulingkan -> membosankan -> penggulingan -> terabit -> bit -> menggerutu -> landak -> babi -> hogan -> jantan -> derail -> derail -> ailerons -> onset -> set -> setback -> pengakuan -> mensyaratkan -> ledgers -> ersatzes -> zest -> didirikan -> hedgerow -> row -> rowboat -> oat - -> oaten -> sepuluh -> dapat dipertahankan -> pemutih -> sakit -> lebih murah -> pena -> penalizes -> bersemangat -> fulcra -> kepiting -> rabbi -> rabbi -> calabash -> ash -> malu -> medali -> dale -> ale -> waspada -> membosankan -> licik ->slyest -> mendirikan -> hes -> ragu-ragu -> semut -> antasida -> sari -> tergelincir -> tepian -> gestated -> kebosanan -> esai -> esai -> say - -> Mengatakan -> Cerdik -> Kecerdasan -> Esai -> Cerdik -> Menggulingkan -> Kecerdasan -> Esais -> Tanah Genting -> Musuh -> Berotot -> Kucing -> Bencana Alam -> tikus -> es -> gunung es -> erg -> ergonomis -> mikra -> kepiting -> tempat tidur -> bedazzles -> lesbian -> jawaban -> itu -> ere - -> mendirikan -> menelan -> membangun -> menelan -> menelan -> ion -> ionisasi -> ionizer -> nol -> mengikis -> ode -> odes -> desalinates -> test -> establishment - -> entailing -> ingot -> got -> gotten ->penyewa -> antagonis -> isthmus -> wijen -> ameba -> basal -> salaamed -> medali -> pengion -> ingraining -> ingrains -> ingrains -> ins -> insane - -> anekdotal -> bedak -> alkohol -> tahan -> tua -> dahulu -> den -> denatur -> urea -> jangkauan -> sakit -> pagar -> pagar gestates -> testable -> bleached -> hedging -> ingrates -> testament -> entangle -> gleamed -> medali -> onshore -> ore -> oregano -> anodes - -> desalinating -> ingratiates -> testates -> tester -> terabits -> nya -> itu sendiri -> elf -> elfin -> fin -> finagle -> finagle -> gleaming -> ingratiating -> ingratiatingly -> gliserin -> kulit -> hutang -> esens ->sesar -> jawab -> pemutih -> dia -> pemberita -> alder -> derailing -> bahan -> keterjeratan -> terjerat -> lesi -> ionosfer -> ereksi - -> ionospheres -> dijual kembali -> peringatan -> ingresses -> wijen -> mes -> mesas -> sash -> ashcan -> can -> canard -> ardor -> dorkiest -> estate -> testes -> testicle -> cleaner -> nerdiest -> esteemed -> meddles -> lessee -> see -> seeded -> dedicates -> testicles - -> kurangi -> senat -> testiest -> harga -> tumbuh ke dalam -> pemilik -> pemilik -> saraf -> vesikel -> terbersih -> ester -> terabytes -> testis -> tissue -> sue -> suede -> edema -> emaciates ->testosteron -> satu -> yang -> sarang -> esthetes -> testy -> sty -> styes -> ya -> yeshiva -> vaskuler -> larches -> ragu-ragu - -> gliserin -> termakan -> pelembab -> keratin -> timah -> tingtur -> uretra -> bajingan -> kalamin -> tidak dapat dikonsumsi -> paling buruk -> estetik -> estetika -> tic -> tick -> ickiest -> estimable -> bleariest -> estimator -> tor -> torched -> hedonistic -> ticker -> kerchieves -> vesicles -> lessens - -> berlindung -> cedar -> dare -> are -> area -> terjangkau -> bleat -> eat -> eatable -> bleeder -> derailment -> enter -> enter -> istilah -> ermine -> tidak terlukiskan -> bleeped -> pedagog -> goggle -> gleans ->jawab -> merah -> redbreast -> aster -> termagant -> antagonis -> tiket -> kettledrum -> rum -> rumbas -> basalt -> altar -> altar -> tar - -> tarantula -> lasagna -> gnarliest -> pengasingan -> dimasukkan -> redcap -> topi -> mampu -> bleeps -> epsilon -> loneliest -> estranges -> menunjuk -> redcaps -> apse -> nama samaran -> nymphomania -> niacin -> cinchonas -> nasal -> laku -> campuran -> end -> endanger -> geriatric - -> beras -> icebound -> undeceives -> vesper -> per -> perambulator -> tore -> ores -> resales -> lesser -> sera -> era -> era -> ruam -> ashen -> antek -> man -> manacle -> cleanliest ->estrogen -> gendarmes -> mescal -> kalsium -> tidak efisien -> menghibur -> gliserol -> peran -> oleander -> kekacauan -> hiburan -> hiburan -> hiburan -> tidak terpuaskan - -> dicampur -> disimpulkan -> pohon aras -> arsenik -> bagus -> lemari es -> kotak -> gerbong -> mobil -> caracul -> cullender -> deranges -> gestures -> reschedules -> lesson -> son -> sonar -> narc -> arc -> arcade -> adenoidal -> dales -> lessor -> sorbet -> bet - -> betaken -> ken -> kens -> ensemble -> blender -> deride -> idea -> diaken -> con -> cekung -> pembalas -> pembalas -> erat -> anemia -> miaowed -> nikah -> nikah -> deducible -> blent - -> enthrall ->all -> allay -> lay -> layaway -> way -> wayfarer -> reran -> ran -> rancher -> pemberita -> deductible -> diberkati -> diberkati -> sedan - -> menari -> cedes -> descant -> trenggiling -> disebut -> meddlesome -> omegas -> gas -> gash -> ashram -> ram -> ramble -> hew -> lewder -> derides -> descend -> endangered -> redcoat -> sumpah -> atherosclerosis -> sis -> sisal - -> salad -> lad -> ladder -> ladder - -> turunan -> kapal -> jarang -> domain -> tertulis -> bedbug -> bug -> bugaboo -> boo -> boobed -> bedder -> derives -> sisa -> gesundheit -> baik -> heraldik -> dadu -> pemecah es -> minyak tanah -> enema -> email ->penyakit -> penobatan -> semangat -> gunakan -> digunakan -> sedater -> terminator -> toreador -> dormant -> antebellum - -> lumbago -> lalu -> agonizingly - -> glikogen -> gender -> dermis -> misadventures -> rescind -> tidak senonoh -> antusias -> obat penenang -> jubah -> penggemar -> astir -> tirades -> keturunan -> anteseden -> membujuk -> icecap -> kapasitor -> siksaan -> dibujuk -> cedilla -> llama -> amalgam -> gambit -> pelacur -> pelacur -> ragu-ragu - -> bersaksi -> ingrate -> ate -> ateliersgender -> dermis -> misadventures -> rescind -> tidak senonoh -> antusias -> obat penenang -> jubah -> penggemar -> astir -> tirades -> keturunan -> keturunan -> anteseden - -> menarik -> icecap -> kapasitor -> siksaan -> dibujuk -> cedilla -> llama -> amalgam -> gambit -> pelacur -> pelacur -> ragu-ragu -> bersaksi -> ingrate -> ate -> ateliersgender -> dermis -> misadventures -> rescind -> tidak senonoh -> antusias -> obat penenang -> jubah -> penggemar -> astir -> tirades -> keturunan -> keturunan -> anteseden - -> menarik -> icecap -> kapasitor -> siksaan -> dibujuk -> cedilla -> llama -> amalgam -> gambit -> pelacur -> pelacur -> ragu-ragu -> bersaksi -> ingrate -> ate -> ateliers

KSFT
sumber
2

Mathematica 1482 1655

Sesuatu untuk memulai ...

dict=Import["words.txt"];
words=Union@Select[StringSplit[dict],(StringFreeQ[#,"'s"])\[And]StringLength[#]>2
  \[And]LowerCaseQ@StringTake[#,1]&]

Tautan adalah awalan persimpangan dan sufiks untuk kata-kata.

prefixes=Union[StringTake[#,3]&/@words];
suffixes=Union[StringTake[#,-3]&/@words];
links=Intersection[prefixes,suffixes];
linkableWords=(wds=RandomSample@Select[words,MemberQ[links,StringTake[#,3]]\[And]MemberQ[links,StringTake[#,-3]]& ])/.
w_String:> {w,StringTake[w,3],StringTake[w,-3]}

Tepi adalah semua tautan terarah dari satu kata ke kata lain:

edges[{w_,fr_,fin_}]:= Cases[linkableWords,{w1_,fin,_}:> (w\[DirectedEdge]w1)]
allEdges=Flatten[edges/@linkableWords];
g=Graph@allEdges;

Temukan jalur antara "memperbaiki" dan "semangat".

FindPath[g, "begin", "end", {1480, 3000}, 1][[1]]

Hasil (1655 kata)

{"mend", "endorser", "server", "vertebral", "rallying", "ingrains", 
"insurrectionist", "isthmus", "mussels", "elsewhere", "erection", 
"ionizes", "zestful", "fullness", "essaying", "ingeniously", 
"slyest", "estimator", "tornados", "doses", "sesame", "amebic", 
"bicycled", "ledges", "gestation", "ionizing", "ingratiates", 
"testifying", "ingesting", "inglorious", "ouster", "terminated", 
"tediousness", "essayist", "isthmuses", "session", "ion", 
"ionization", "ionospheres", "resubmitted", "tedious", "ousting", 
"ingest", "ester", "terminates", "testicle", "cleanliness", "essay", 
"say", "saying", "ingratiating", "ingratiatingly", "glycerine", 
"inefficient", "entrances", "cesarians", "answering", "ingenious", 
"ousted", "tediously", "sly", "slyness", "essences", "cesareans", 
"answer", "were", "erecting", "ingredient", "enterprises", 
"sessions", "onshore", "oregano", "anorak", "raking", "ingraining", 
"ingrown", "owner", "nerdiest", "estranging", "ingot", "gotten", 
"tendonitis", "tissue", "suede", "edelweiss", "issuing", "ingestion", 
"ionosphere", "erections", "onset", "settles", "lesion", "ionizer", 
"zeroing", "ingresses", "sesames", "mesmerizing", "ingrates", 
"testes", "testiest", "estrangement", "entail", "ail", "ailment", 
"entice", "icecap", "captivates", "testy", "sty", "stylistic", 
"tickles", "lessee", "seeded", "deductibles", "lesser", 
"servicewoman", "many", "anymore", "ores", "resourceful", "fullback", 
"acknowledgment", "entertainer", "nerves", "vest", "esteemed", 
"mediates", "testament", "entered", "redbreast", "astonishes", 
"hesitatingly", "glycogen", "genera", "eras", "rashes", "hesitates", 
"testicles", "lest", "establishment", "entwines", "nest", "estates", 
"testates", "testosterone", "oneself", "elf", "elfin", "fingered", 
"redcaps", "apse", "pseudonym", "nymphomania", "niacin", "cinemas", 
"masochistic", "tickled", "led", "ledger", "geriatric", "rice", 
"icebreaker", "kerosine", "inexperienced", "ceded", "deductible", 
"blew", "lewder", "derivable", "blemished", "hedgerow", "rowel", 
"welfare", "arena", "enamel", "melded", "dedicates", "tester", 
"terabit", "bitmap", "mapped", "pedicures", "restored", "redeemer", 
"merchantman", "manipulator", "torpedos", "dosed", "seduced", 
"cedilla", "llano", "another", "heretic", "tic", "ticker", "keratin", 
"tinctures", "restaurateur", "euros", "rosettes", "testable", 
"bleaker", "kerosene", "energizer", "zero", "eroded", "deduced", 
"cedar", "dare", "ares", "respondent", "entranced", "cedillas", 
"lasagnas", "nastiest", "esthetic", "ticket", "ketches", "hes", 
"hesitant", "antipasto", "stoppered", "redounded", "deducible", 
"bleeped", "pedant", "antimatter", "terminable", "blent", "enthuse", 
"user", "serenade", "adenoidal", "dales", "lessen", "sentimental", 
"talker", "kerchieves", "vestry", "try", "tryout", "outdone", "ones", 
"nestles", "lesson", "songwriter", "terrapin", "pinched", 
"hedonistic", "tick", "ickiest", "established", "hedgehog", "hogan", 
"gander", "derringer", "gerbil", "billboard", "ardor", "dorkiest", 
"estrogen", "gent", "entirety", "etymological", "calk", "alkalis", 
"lissome", "omegas", "gasolene", "enema", "emaciates", "test", 
"estranges", "gestured", "redeemed", "medic", "diced", "cedars", 
"arsenic", "nice", "iceberg", "erg", "ergonomic", "microcomputer", 
"terser", "sergeant", "antipastos", "tost", "osteopathy", "thy", 
"thymus", "mussiest", "estimable", "blend", "endeavored", "redound", 
"undercover", "verbal", "balk", "alkali", "alibi", "ibis", "bison", 
"sonar", "narcosis", "sister", "terraced", "cede", "edema", 
"emancipator", "torpor", "portraiture", "urea", "reassign", 
"ignoble", "blenched", "hedges", "gesture", "urethras", "raspy", 
"spyglass", "ass", "assailant", "antiquarians", "answered", 
"reduced", "cedes", "despair", "airfares", "resumed", "medicine", 
"ineffable", "bleacher", "herdsmen", "menhaden", "dent", 
"entitlement", "enticement", "entangle", "gleamed", "medullas", 
"lassie", "sieve", "even", "vender", "derivatives", "vessel", 
"selectmen", "mentor", "toreador", "dormer", "meringue", "guerrilla", 
"llanos", "nosedove", "overact", "actionable", "bleeps", "epsilon", 
"longhorn", "ornament", "entreaty", "atypical", "calendar", "dares", 
"resurgent", "entreat", "eater", "term", "ermine", "inedible", 
"bleeder", "derrières", "resentful", "fulcra", "crabbed", 
"bedevilment", "entwine", "inelegant", "antitoxins", "inspired", 
"redder", "derides", "descendant", "antihistamine", "inequitable", 
"bleat", "eaten", "tenured", "redcap", "capstans", "answerable", 
"blender", "deranges", "gestures", "restart", "arteriosclerosis", 
"sis", "sisal", "saltpeter", "terrifyingly", "glycerin", "rink", 
"inkwell", "ellipsis", "sisterhood", "oodles", "lessor", "sorrowed", 
"wedges", "gesundheit", "either", "hereafter", "termite", "iterator", 
"tornado", "adobes", "bespoken", "ken", "kens", "ensnare", "area", 
"rear", "earful", "fulfil", "fillet", "letdown", "ownership", 
"hipped", "pediatric", "richer", "heretical", "calculus", "lusher", 
"heraldic", "dice", "icebound", "underscored", "redskins", "instant", 
"antiperspirant", "anthropomorphic", "hiccup", "cup", "cups", 
"upstage", "agendas", "dashingly", "glycerol", "role", "oleo", 
"leonine", "ineluctable", "blessed", "sedatives", "vesicles", 
"lessens", "ensured", "redefine", "inextinguishable", "bleach", 
"achoo", "hooch", "ocher", "hero", "erode", "ode", "odes", "desktop", 
"topple", "pleasured", "redeveloped", "pediment", "entrapped", 
"pederasty", "stylus", "lush", "usher", "hermaphrodite", "item", 
"tempos", "postpaid", "aide", "ideogram", "rampart", "artisan", 
"sandhog", "hog", "hogwash", "ash", "ashram", "rammed", "mediocre", 
"crestfallen", "lend", "endow", "downcast", "astronomer", 
"merriment", "entrant", "antiwar", "warden", "dentures", "restful", 
"fulfillment", "entrapment", "enthrall", "allay", "layout", 
"outbound", "underclassman", "manhole", "oleander", "dermis", 
"misused", "sedater", "terrific", "fiche", "cheapens", "ensnares", 
"restrains", "insolent", "entombed", "bedraggle", "gleeful", 
"fulfilment", "entrenchment", "entrap", "rapper", "persistent", 
"enthronement", "enthusiast", "astute", "uterus", "rustproofed", 
"fedora", "orangeades", "despised", "seducer", "ceramic", 
"microscopic", "picnic", "nicotine", "inexpedient", "entomb", 
"ombudsman", "mantel", "teletypewriter", "terminological", "calif", 
"lifetimes", "mescaline", "inertia", "tiaras", "raster", "terrace", 
"acetaminophen", "henchmen", "menhadens", "enslaves", "vesper", 
"peroxide", "ideograph", "aphid", "hides", "desideratum", "tumor", 
"mortgagee", "geegaw", "gawk", "awkward", "ardent", "enthused", 
"sediment", "enter", "termed", "mediaeval", "valentine", "inexact", 
"actives", "vestment", "entourage", "agent", "entryway", "wayside", 
"idea", "dear", "earache", "checkups", "upsides", "descent", 
"entertainment", "entomological", "calicos", "cosign", "ignored", 
"redcoat", "oaten", "tensed", "sedan", "dank", "anklet", "lettered", 
"redskin", "kingpin", "pinups", "ups", "upshot", "hotbed", 
"bedtimes", "mes", "messenger", "germicides", "destitute", "utensil", 
"silencer", "cervix", "vixens", "ensign", "ignorant", "antipasti", 
"stimulus", "lusty", "stymie", "miens", "enslave", "averred", 
"redrew", "rewritten", "tenpins", "instructor", "torrent", 
"entertains", "insult", "ultrasound", "undersides", "despoil", 
"oilcloth", "other", "hereupon", "pondered", "redundant", "anthill", 
"ill", "illicit", "citizens", "ensnared", "rediscovered", "redesign", 
"ignoramus", "muskmelon", "longer", "gerrymander", "deride", "ideas", 
"easy", "asylum", "lumbermen", "mendicant", "antlered", "redevelop", 
"lopes", "pester", "terrapins", "instil", "tildes", "deserves", 
"vesicle", "cleave", "avenger", "germane", "anemia", "miasmas", 
"mash", "ashy", "shy", "shyster", "termagant", "antiaircraft", 
"afterglow", "lowland", "and", "androgen", "genitalia", "liars", 
"arson", "sonatas", "taste", "stepsister", "termini", "initiator", 
"tor", "torn", "ornamental", "tallow", "lowered", "red", "redraft", 
"aft", "aftertaste", "stereotypes", "pesky", "skyrocket", 
"kettledrum", "rummer", "merciful", "fulsome", "omens", "ensures", 
"resultant", "antennas", "nasal", "saleswoman", "mane", "anemometer", 
"terrains", "insightful", "fulcrum", "rumbas", "baseman", 
"mannikins", "insures", "resound", "underpass", "assassins", "inset", 
"settee", "teethe", "theological", "calf", "alfresco", "scornful", 
"fulfill", "illustrator", "torpid", "pidgin", "gins", "instal", 
"talc", "alcove", "overtakes", "kestrel", "relabel", "beleaguered", 
"redraw", "rawhide", "identical", "caliber", "beret", "retrace", 
"acetylene", "enemas", "massacred", "redeploys", "oyster", 
"terminator", "tortillas", "last", "astronomical", "calliope", 
"operator", "tort", "orthographic", "hiccups", "upstart", 
"artificer", "cervical", "callus", "lustre", "trend", "endeavor", 
"vortex", "textures", "researcher", "heroins", "instill", "illegal", 
"galloped", "pedagogical", "callipered", "rediscover", "vertebra", 
"brasher", "herbicides", "descry", "cryptogram", "ramrod", "rodeo", 
"deodorizer", "zeros", "rosebush", "ushered", "redden", "denatures", 
"reset", "setups", "upside", "ides", "describes", "besides", 
"desperado", "adores", "reshuffle", "flea", "leaflet", "lethal", 
"halibut", "but", "button", "tonic", "niche", "cherubim", "bimbos", 
"bosun", "sunk", "unkind", "indentures", "resend", "endures", 
"restorer", "reran", "rang", "anger", "germicide", "ideological", 
"calabash", "ashamed", "medical", "caloric", "rickshas", "hasten", 
"tendon", "donkey", "keyword", "ordains", "insecticides", "desires", 
"resin", "sins", "inspector", "torrid", "rid", "rides", "despot", 
"potpie", "piebald", "aldermen", "menace", "ace", "acerbic", "bicep", 
"cephalic", "lichen", "hennas", "nasty", "styes", "yesterday", "day", 
"daybed", "bedridden", "dental", "talisman", "mankind", "indignant", 
"antique", "questionnaires", "resubmit", "mitten", "tenfold", "old", 
"olden", "denudes", "design", "ignores", "resumes", "mesdames", 
"mesas", "sass", "assemblywoman", "mangle", "glee", "leeway", 
"waylay", "laywomen", "menswear", "ear", "earldom", "domains", "ins", 
"instrumental", "tall", "all", "allegorical", "calm", "almanac", 
"nacre", "credit", "dittos", "tossup", "superman", "mandolin", 
"linesman", "manacle", "cleverer", "rerun", "runaway", "way", 
"wayfarer", "reruns", "unshaven", "ventures", "resell", "elliptical", 
"calmer", "mercuric", "ricochet", "heterodoxy", "oxymora", 
"orangutang", "angina", "inapt", "apt", "aptitudes", "descend", 
"endear", "earlobes", "bestowal", "walleyes", "yes", "yeshivas", 
"vassal", "saltcellar", "larval", "valiant", "anthropological", 
"calfskin", "kind", "inductee", "tee", "teenager", "gerund", 
"underclass", "assemblyman", "manservant", "antelopes", "peso", 
"esoteric", "rickshaw", "hawser", "servicewomen", "mental", 
"tallyhos", "hospital", "talon", "longshoremen", "men", "menthol", 
"holography", "phylum", "lumberman", "manikin", "kingpins", 
"install", "allures", "resuscitator", "tortilla", "llamas", 
"massacres", "resistor", "tormentor", "torque", "queasy", 
"asymmetric", "ricksha", "sharped", "pedlar", "largos", "gossamer", 
"merganser", "service", "icebox", "boxer", "xerography", "physical", 
"calculator", "tortures", "resonant", "anticlimax", "maxima", "imam", 
"mammon", "monograph", "aphelia", "liaison", "sonic", "nicknamed", 
"media", "diametrical", "calliper", "performed", "medulla", "llama", 
"amalgam", "gamins", "insulin", "lineman", "mantra", "transplant", 
"antigen", "genres", "respires", "resold", "oldie", "diesel", 
"seldom", "domed", "medieval", "valor", "lordship", "hipper", "per", 
"perspires", "restores", "restructures", "resort", "orthodoxy", 
"oxygen", "gentlemen", "menopausal", "saltpetre", "treacle", 
"cleaver", "verdigris", "risen", "send", "end", "endemic", 
"microfiche", "checkout", "outclass", "assault", "ultraviolet", 
"let", "letterbox", "boxcar", "carom", "roman", "manifesto", 
"stovepipes", "pesticides", "described", "bedsides", "descant", 
"anthem", "hempen", "penguins", "insignificant", "antebellum", 
"lumbar", "barracudas", "dash", "ashcan", "cannonball", "allover", 
"verbena", "enamor", "morgue", "guerrillas", "lash", "ashen", 
"henchman", "mandolins", "inspires", "resistant", "antechamber", 
"bereave", "aver", "vermin", "minim", "nimbus", "bus", "businessman", 
"mantras", "rasp", "asphalt", "altogether", "her", "hereabout", 
"outcast", "astrological", "calisthenic", "nicknames", "mescal", 
"calliopes", "pesetas", "tassel", "selectman", "mannikin", 
"kinswoman", "man", "manic", "nicer", "cerebra", "bravado", "adobe", 
"obeisant", "antiparticle", "clever", "versus", "sushi", "shirr", 
"irrelevant", "antelope", "open", "pentagon", "gonad", "nadir", 
"directorship", "hippopotami", "amid", "midwifed", "fedoras", 
"rasher", "herbal", "ball", "allot", "lot", "lotus", "tussle", 
"sledgehammer", "merchant", "ant", "antidepressant", "anther", 
"heraldry", "drywall", "allegros", "rosebud", "budgerigar", 
"garbageman", "manikins", "inscribes", "bestow", "townsmen", "menu", 
"enures", "restaurant", "antithetical", "calico", "icon", "confound", 
"underbid", "bidden", "denser", "seraphic", "hiccuped", "pedigree", 
"reeve", "ever", "vertical", "caliper", "perusal", "salami", "amir", 
"mires", "restraint", "interstellar", "larkspur", "puritanical", 
"calligrapher", "herdsman", "manatee", "teepee", "peeve", "everyday", 
"daydreamer", "meres", "result", "ultimatum", "tumbril", "rill", 
"illogical", "calligraphy", "physic", "sickbed", "bedsores", 
"resolver", "vertebras", "rascal", "call", "allergenic", "nickname", 
"amebas", "baste", "stepson", "son", "sonnet", "net", "nether", 
"heros", "rosins", "insular", "larvas", "vast", "astrakhan", 
"handyman", "manicures", "resins", "instep", "tepid", "pidgins", 
"inscribed", "bedbug", "bug", "bugbear", "earwax", "waxen", 
"xenophobia", "biathlon", "longhair", "airstrip", "ripple", "pleas", 
"eastbound", "underachiever", "verbatim", "timbre", "brew", 
"rewound", "underplay", "laywoman", "mandarins", "insofar", "farm", 
"armpit", "pitcher", "herald", "alderman", "mangos", "gossip", 
"sipped", "pedagogue", "guerillas", "laser", "serape", "aped", 
"pederast", "astound", "underground", "underpins", "insane", 
"anemic", "micra", "crane", "anew", "new", "newscast", "astir", 
"tiro", "ironware", "are", "areas", "east", "astronomic", 
"microchip", "hippopotamus", "mustache", "chervil", "villas", "lass", 
"assassin", "sinew", "newsman", "mangrove", "overtax", "taxicab", 
"cabana", "anathemas", "mast", "astronaut", "author", "horoscope", 
"opera", "eraser", "serfdom", "dominos", "nostrum", "rumpus", "pus", 
"pushcart", "arthropod", "podia", "diatom", "tomboy", "boycott", 
"ottoman", "manhunt", "untidy", "idyllic", "licensee", "seethe", 
"thereabout", "outplay", "layoff", "officer", "cerebrum", "rum", 
"rumple", "plethora", "oracle", "clergyman", "maneuver", "verandas", 
"dashikis", "kisser", "serum", "rumor", "morbid", "bidet", "detach", 
"achiever", "vertex", "text", "extremer", "merino", "inopportune", 
"uneaten", "tensor", "sort", "orthopedic", "dickie", "kielbasas", 
"sashay", "hayloft", "often", "ten", "tenpin", "pinkeye", "eyeball", 
"allegro", "grout", "outfox", "fox", "foxtrot", "rot", "rotund", 
"underwear", "earshot", "hot", "hotshot", "hotel", "telex", 
"lexicon", "congresswoman", "manor", "northbound", "undertow", 
"township", "hippos", "possessor", "sorbet", "betcha", "chart", 
"art", "article", "clear", "earwig", "wigwam", "wampum", "pummel", 
"melodic", "dictum", "tumbrel", "relic", "licit", "citadel", "delay", 
"lay", "laypeople", "plectra", "traumas", "mascot", "cotyledon", 
"donor", "nor", "normal", "malt", "altar", "tart", "artiste", 
"stencil", "cilantro", "trouper", "pericardia", "diadem", "democrat", 
"rattan", "tang", "angstrom", "romper", "perturb", "urban", "bang", 
"angel", "gelatin", "tint", "intros", "rostra", "trapper", 
"persimmon", "monsignori", "origin", "ginkgos", "gospel", "pelvis", 
"visor", "sorghum", "humid", "midair", "air", "airfoil", "oil", 
"oilskin", "kin", "kindergarten", "tentacle", "cleanser", "sermon", 
"monolog", "logarithmic", "microbes", "bestir", "tiros", "rosin", 
"sin", "singleton", "tonsil", "silicon", "con", "constraint", 
"intagli", "glint", "interwove", "overshadow", "downtrodden", 
"dentin", "tin", "tinsel", "sellout", "out", "output", "put", 
"putsch", "schoolmarm", "arm", "armor", "moribund", "underpin", 
"pint", "interloper", "periwig", "wig", "wigwag", "wagon", 
"gonorrhea", "hearten", "tenon", "nonverbal", "balsam", "samovar", 
"varmint", "interviewee", "weeper", "perturbed", "bed", "bedpan", 
"panache", "chestnut", "nut", "nutmeg", "meg", "megalopolis", 
"lissom", "somersault", "ultra", "tram", "ramp", "amputee", "teeth", 
"ethos", "hos", "hostel", "telescopic", "picayune", "uneven", 
"vendor", "dorsal", "salad", "ladybug", "bugaboo", "boomerang", 
"angora", "orangutan", "tandem", "demagogry", "gryphon", 
"honeycombed", "bedlam", "lamb", "ambergris", "risky", "sky", 
"skycap", "capstan", "tannin", "ninepin", "pinpoint", "interpret", 
"retiree", "reefer", "fer", "ferret", "returnee", "needlepoint", 
"interurban", "bantam", "tamp", "ampul", "pullout", "outrun", 
"runabout", "outstrip", "rip", "ripen", "pennon", "nonfat", "fathom", 
"homespun", "puns", "unsubscribes", "besom", "sombre", "breathe", 
"theatre", "tremor", "mortar", "tarpaulin", "lintel", "telethon", 
"honeydew", "dewlap", "lap", "lapel", "pelvic", "victim", "timpani", 
"animus", "muscat", "cat", "catsup", "sup", "superstar", "taro", 
"arousal", "salamis", "misprint", "interwoven", "venom", "nomad", 
"madam", "dam", "dampen", "penicillin", "lint", "intercom", 
"compound", "underpay", "pay", "payoff", "off", "offal", "fallout", 
"outwit", "withal", "halt", "altho", "tho", "thou", "housebound", 
"undergrad", "radio", "diocesan", "sanserif", "riffraff", 
"affidavit", "vitamin", "minicam", "campus", "pussycat", "catamaran", 
"rancor", "cornucopia", "piano", "anon", "non", "nonpartisan", 
"sandbar", "bar", "barren", "renewal", "walkout", "outruns", 
"unsnap", "naphtha", "thalamus", "musky", "skydove", "overrun", 
"run", "runs", "unsheathe", "the", "theorem", "remove", "overreach", 
"ache", "cherub", "rubes", "beseech", "echo", "chosen", "sensor", 
"sorrel", "relay", "layman", "mantillas", "lasagna", "gnat", 
"natures", "resonator", "torus", "russet", "set", "setback", 
"acknowledgement", "entanglement", "entombment", "entourages", 
"gestates", "testing", "ingratiate", "ate", "ateliers", "ersatzes", 
"zest"}
DavidC
sumber
1

Python, 90

deductible upee lessee jungkat-jungkit menikah menikah menyimpulkan cesarians jawab berkat wijen jerat ragu ionosphere erection ionizer nol erode ode odes deserter termination onshore ores tinggal idealistis tic tickle pelajaran lagu yang sedang berlangsung menanamkan secara tidak sengaja reliing estranges merespon dengan penuh semangat sangat berterima kasih sangat menyenangkan. mengakui ruam uretra gesture ragu-ragu antiklimaks maxima bayangkan inebriates diuji dengan sangat licik esensi sesar jawaban werewolves rompi werewolf termasyhur omelet lettuces cession ionisasi ionisasi menelan ionosfer penyelamatan cueing yang ditanamkan kepemilikan hip hippie kesalehan etimologis saat sesi dipimpin diselesaikan.

Pertama saya membersihkan daftar secara manual dengan menghapus semua

  • kata-kata dengan huruf kapital
  • kata-kata dengan apostrof
  • kata-kata dengan éêèáâàö
  • Kata 1- dan 2 huruf

ini menghabiskan paling banyak 2 poin karena kata-kata itu hanya bisa di awal atau akhir rantai tetapi mengurangi daftar kata dengan 1/3 dan saya tidak harus berurusan dengan unicode.

Selanjutnya saya membuat daftar semua pra- dan sufiks, menemukan tumpang tindih dan membuang semua kata kecuali kedua pra- dan sufiks berada di set tumpang tindih. Sekali lagi, ini mengurangi paling banyak 2 poin dari skor maksimal yang dapat saya peroleh, tetapi mengurangi daftar kata menjadi sepertiga dari ukuran aslinya (coba jalankan algoritme Anda pada short_list untuk kemungkinan percepatan) dan kata-kata lainnya sangat terhubung (kecuali beberapa 3 kata-newsletter yang terhubung hanya untuk diri mereka sendiri). Faktanya, hampir semua kata dapat dicapai dari kata lain melalui jalur dengan rata-rata 4 tepi.

Saya menyimpan jumlah tautan dalam sebuah matriks adjacency yang menyederhanakan semua operasi dan memungkinkan saya melakukan hal-hal keren seperti melihat n langkah di depan atau menghitung siklus ... setidaknya secara teori, karena dibutuhkan sekitar 15 detik untuk menyelesaikan matriks yang sebenarnya tidak saya lakukan. ini selama pencarian. Alih-alih, saya mulai dengan awalan acak dan berjalan secara acak, baik memilih akhir yang seragam, mendukung yang sering terjadi (seperti '-ing') atau yang jarang terjadi.
Ketiga varian menyedot sama rata dan menghasilkan rantai dalam kisaran 20-40, tetapi setidaknya cepat. Sepertinya saya harus menambahkan rekursi.

from numpy import *
f = open('words_short.txt')
words = f.read().split() # 62896
f.close()

prefix = [w[:3] for w in words]     # 2292
suffix = [w[-3:] for w in words]    # 2262
common = set(prefix) & set(suffix)  # 1265

PSW = [(p,s,w) for (p,s,w) in zip(prefix, suffix, words) if p in common and s in common] # 28673
common = list(common)
mapping = dict(zip(common, range(len(common)))) # enumerate trigrams

M = zeros((len(common), len(common)), dtype=int) # for fast processing
W = [[[] for i in range(len(common))] for j in range(len(common))] # for reconstruction
for p,s,w in PSW: # build adjacency matrix
    M[mapping[p], mapping[s]] += 1
    W[mapping[p]][mapping[s]].append(w)

def chain(A, rho=0):
    B = array(A)
    links = []
    start = random.randint(len(B))
    links.append(start)
    while 1:
        nextpos = where(B[links[-1],:]>0)[0]
        if len(nextpos)==0: return links
        nextnum = B[links[-1],nextpos]

        p = ones(len(nextnum))/len(nextnum) # pick uniformly
        if rho>0: p = nextnum*1./sum(nextnum) # prioritize many links
        if rho>1: p = 1/p; p = p/sum(p) # prioritize few links

        chosen = random.choice(nextpos, p=p)
        B[links[-1], chosen] -= 1
        links.append(chosen)

def chain2words(L):
    # can only be used once because of .pop()
    z = zip(L[:-1],L[1:])
    res = []
    for p,s in z:
        res.append(W[p][s].pop())
    return res

chains = [chain(M) for i in range(100)]
bestchain = chains[argmax(map(len, chains))]
print ' '.join(chain2words(bestchain))

Awalnya saya ingin mencoba sesuatu yang mirip dengan ini tetapi karena ini adalah grafik berarah dengan siklus, tidak ada algoritma standar untuk Penyortiran Topologis, Jalur Terpanjang, Jalur Euler Terluas atau Masalah Tukang Pos Cina bekerja tanpa modifikasi berat.

Dan hanya karena kelihatannya bagus, di sini adalah gambar dari matriks adjacency M, M ^ 2 dan M ^ infinity (infinity = 32, tidak berubah setelah itu) dengan putih = entri bukan nol
masukkan deskripsi gambar di sini

DenDenDo
sumber
Jadi nilaimu 90? Meskipun kami sudah memiliki 1700+ entri .. Ada yang hilang?
Pengoptimal
1
Pertama-tama saya masih mengerjakannya, tetapi selain itu - Sepertinya ide yang bagus, saya mencobanya, gagal. Jika ada, ini akan menghentikan orang dari membuang-buang waktu menggunakan pendekatan yang sama
DenDenDo
Heh :) Pertahankan sikap positif :) Harapan untuk melihat ini mendapatkan hasil yang lebih baik.
Pengoptimal
2
" kata-kata itu hanya bisa di awal atau di akhir rantai " tidak benar. Komponen terhubung terbesar dalam grafik termasuk kata-kata seperti boutonnières yang memiliki karakter beraksen, tetapi tidak dalam awalan atau sufiks. Ini hanya memengaruhi sekitar selusin kata, tetapi salah satunya mungkin merupakan tautan kunci.
Peter Taylor