Background
The external-memory model, or DAM model, charges an algorithm by the number of I/Os it performs (essentially, the number of cache misses). These running times are generally given in terms of M, the size of the memory, and B, the number of words that can be moved to memory at one time. Sometimes L and Z are used in place of B and M, respectively.
For example, sorting costs Θ((N/B) log_{M/B}(N/B)) and naive matrix multiplication costs Θ(n³/(B√M)).
This model is used to analyze "cache-oblivious algorithms", which do not have knowledge of B or M. Generally the goal is for the cache-oblivious algorithm to perform optimally in the external-memory model; this is not always possible, as in the permutation problem for example (shown in Brodal, Fagerberg 2003). See this article by Erik Demaine for a further explanation of cache-oblivious algorithms, including a discussion of sorting and matrix multiplication.
We can see that changing M causes a logarithmic speedup for sorting and a polynomial speedup for matrix multiplication. (This result is from Hong, Kung 1981 and actually predates both cache-obliviousness and the formalization of the external-memory model.)
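To make the two speedups concrete, here is a tiny Python sketch (function names and the parameter values are my own, purely illustrative) that evaluates both I/O bounds as M grows: the sorting cost shrinks only logarithmically in M, while the matrix-multiplication cost shrinks polynomially, as 1/√M.

```python
import math

def sort_ios(N, M, B):
    """External-memory sorting bound: (N/B) * log_{M/B}(N/B)."""
    return (N / B) * math.log(N / B, M / B)

def matmul_ios(n, M, B):
    """Naive matrix multiplication bound: n^3 / (B * sqrt(M))."""
    return n ** 3 / (B * math.sqrt(M))

# grow the cache M and watch how each cost falls
N, B = 2 ** 30, 2 ** 10
for M in (2 ** 14, 2 ** 20, 2 ** 26):
    print(M, sort_ios(N, M, B), matmul_ios(1024, M, B))
```

Quadrupling the exponent of M only halves the sorting cost (logarithmic), while the matrix-multiplication cost drops by the square root of the factor by which M grew (polynomial).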
My question is this:
Is there any case where the speedup is exponential in M? The running time would look like f(N, B)/2^{O(M)}. I am mainly interested in a cache-oblivious algorithm or data structure that fits this description, but would be happy with a cache-aware algorithm/data structure, or even the best known lower bound.
It is generally assumed in most models that the word size is w = Ω(log N), where N is the size of the input, and clearly M > w. Then a speedup of 2^M gives a polynomial speedup in N. This makes me believe that if the problem I am looking for exists, it is not polynomial. (Otherwise we could change the cache size by a constant to get a constant number of I/Os, which seems unlikely.)
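A minimal numeric sketch of this reasoning step (the function name and the choice of taking M at its smallest allowed value are my own assumptions): with w = log2(N) and M > w, a 2^{c·M} speedup is at least 2^{c·log2 N} = N^c, i.e. polynomial in N.

```python
import math

def min_exponential_speedup(N, c=1.0):
    # word-size assumption: w = log2(N); the cache must satisfy M > w,
    # so take M = log2(N) as the smallest (most pessimistic) cache size
    M = math.log2(N)
    return 2 ** (c * M)  # equals N**c: a polynomial speedup in N

for N in (2 ** 10, 2 ** 20, 2 ** 30):
    print(N, min_exponential_speedup(N))
```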
Answers:
I am not sure I understood the question. But it seems to me that, under the assumption that PSPACE contains problems requiring exponential time, such problems would fulfill your requirements: if M is O(log N) you would need an exponential number of I/O operations (since you can't "stay" in the same memory block of size O(log N) for more than a polynomial number of steps without going into a cycle), while if M = N only a polynomial number of I/O operations is needed. As for whether as many as M^{Ω(N)} I/O operations could be required, intuitively I believe this is not true, because I feel it should be possible to design a problem that is in fact the concatenation of smaller problems of size O(log N), each requiring exponential time in its own size, and an exponential number of I/O operations (that is, poly(N), since poly(N) is exponential in O(log N)). In practice I believe PSPACE-complete problems such as TQBF fulfill your condition.
source
this question appears to veer into terra incognita, i.e. unexplored territory. after some thinking & review, here are some thoughts/ideas on this apparently very difficult/subtle question, which hopefully will be intelligent but are not meant to be definitive.
Demaine, whom you cite, writes: "the principal idea [of cache-oblivious algorithms] is simple: design external-memory algorithms without knowing B and M. But this simple idea has several surprisingly powerful consequences."
it also appears to have surprisingly subtle implications. it seems hard to analyze this model for extremal behavior, and apparently nobody has done so yet. there are some surveys, and they seem to cover the entire field to date. this is not surprising given that the field is only about a decade old.
the model does not appear to have been translated to Turing machines yet, where stricter/more formal/theoretical/general/standard analysis may be possible. the whole concept of Big-Oh notation, and its implications and intuitions such as the irrelevance of constants, does not necessarily carry over automatically to cache-oblivious algorithms. for example, the model already works with constants B, M that measure exact memory sizes. the field does not appear to have a set of basic axioms for its dynamics so far.
TMs were invented in 1936 by Turing, and the Hartmanis-Stearns time/space hierarchy theorems (which you are somewhat alluding to in this question) were discovered in 1965. that's a remarkable ~3 decades, yet the time/space hierarchy theorems are now considered somewhat elementary and are taught in undergraduate classes. there do not seem to be corresponding hierarchy theorems established in this model yet, and how to do that is not a trivial problem. this model actually seems to have more "moving parts" than the standard Turing machine (which already has rather fiendishly complex dynamics), i.e. it is like an augmented Turing machine.
another idea is to convert this external-memory model to TMs somehow, which again does not appear to show up in the literature/surveys on cache-oblivious algorithms, and maybe see how the Hartmanis-Stearns hierarchy theorems would translate. it appears to correspond to a two-tape TM where one tape has size M and the other tape is infinite, with blocks of size B transferred to the M-tape. but a difficulty here is that it is more of a RAM model than the Turing model, which uses sequential access to the tape. on the other hand, this could run into subtleties associated with conversions between single- and multi-tape TMs.
suggest attacking this problem empirically with simulations, which is where the literature tends to focus, e.g. as in this great survey by Kumar on cache-oblivious algorithms (2003). there are many programs and papers online for cache simulations, and one could possibly answer your question without a large amount of code, using some simplifications. one basic idea is to create probabilistic algorithms which access "nearby" areas of memory based on probabilities. "nearby" here is not necessarily contiguous memory, but instead the algorithm selects random memory pages (blocks) based on keeping track of its own most recently accessed pages. suggest using power laws to determine the probability of selecting "near" pages in the "most recently accessed" list. this appears to be the key aspect that cache-based performance improvements are related to.
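as a sketch of the suggested simulation (everything here — page counts, the power-law exponent, the LRU replacement policy, all names — is my own illustrative choice, not anything from the literature): generate an access stream that revisits recently used pages with power-law probability, feed it through an LRU cache, and count misses, which play the role of the model's I/Os.

```python
import random
from collections import OrderedDict

def simulate(num_pages=1000, cache_pages=64, steps=20000,
             locality=0.9, recency_exponent=2.0, seed=0):
    """Count cache misses for a probabilistic access stream.

    With probability `locality`, revisit a recently used page chosen
    with power-law weight (rank+1)^-recency_exponent over the
    most-recently-accessed list; otherwise pick a page uniformly at
    random. The cache holds `cache_pages` pages under LRU eviction.
    """
    rng = random.Random(seed)
    cache = OrderedDict()   # page -> None, kept in LRU order
    recent = []             # most recently accessed pages, newest first
    misses = 0
    for _ in range(steps):
        if recent and rng.random() < locality:
            weights = [(rank + 1) ** -recency_exponent
                       for rank in range(len(recent))]
            page = rng.choices(recent, weights=weights)[0]
        else:
            page = rng.randrange(num_pages)
        # maintain the recency list (bounded so the weights stay cheap)
        if page in recent:
            recent.remove(page)
        recent.insert(0, page)
        del recent[256:]
        # LRU cache bookkeeping: a miss stands in for one I/O
        if page in cache:
            cache.move_to_end(page)
        else:
            misses += 1
            cache[page] = None
            if len(cache) > cache_pages:
                cache.popitem(last=False)
    return misses

# higher locality should mean far fewer misses (I/Os)
print(simulate(locality=0.95), simulate(locality=0.2))
```

varying `locality` and `recency_exponent` is one cheap way to probe how strongly miss counts (I/Os) depend on how concentrated the access pattern is, in the spirit of the experiment proposed above.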
here's a rough argument based on M, which is basically a measure of "core memory" versus disk memory. for an algorithm, one would expect that as core memory increases, one only comes close [asymptotically] to a linear improvement in algorithmic speed, and adding "core memory" to get any superlinear increase in speed seems intuitively almost like exceeding the speed limit and "getting something for nothing". however, I don't grasp the model well enough to prove this [but apparently no one else has either, including the authorities/founders/experts].
this cache-oblivious algorithm literature has focused on P algorithms, and I did not see any case at all of an analysis of a non-P algorithm so far; perhaps none exists. this could possibly be because the analysis is too difficult! in which case, questions about extremal behavior may go unanswered in this field in the long term.
it does not even seem intuitively clear, and has possibly not been studied at all, how "small"-complexity algorithms such as those in L, or "big" algorithms such as non-P ones (e.g. Expspace etc.), behave in this model. the model measures memory locality, which seems to be fundamentally different from, but intertwined with, the time and space hierarchies.
there is a somewhat obscure measurement of Turing machine complexity called "reversal complexity", with some study, results, and papers that might be related. basically the TM can cover a certain amount of time and space, but reversals of the tape head during the computation are a somewhat independent measurement. it seems that "high reversals" might be related to "high memory locality", because in both cases the tape head tends to stay in a "smaller" region rather than moving into larger regions.
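to pin down what "reversals" counts, here is a small hypothetical sketch (my own toy definition over a head-position trace, not the formal one from the reversal-complexity literature): given the sequence of tape-head positions over time, count how often the direction of movement flips. a head sweeping back and forth in a small region (high locality) reverses often; a single long sweep never reverses.

```python
def reversals(positions):
    # count direction changes of the tape head over a trace of positions
    count, last_dir = 0, 0
    for prev, cur in zip(positions, positions[1:]):
        d = (cur > prev) - (cur < prev)   # -1, 0, or +1
        if d and last_dir and d != last_dir:
            count += 1
        if d:
            last_dir = d
    return count

print(reversals([0, 1, 2, 1, 0, 1, 2, 1, 0]))  # back-and-forth in a small region
print(reversals(list(range(9))))               # one long sweep: no reversals
```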
this question and model remind me of Amdahl's law, and I suspect some kind of yet-undiscovered similar law related to diminishing returns or balances/tradeoffs might hold or be applicable in this area. rough reasoning: cache-oblivious algorithm theory looks at a balance/tradeoff between a finite "local" memory and a cost-based external "infinite" disk. these are basically two resources that behave "in parallel", and there is likely some kind of optimal tradeoff between them.
source