Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Self referencing example #56

Closed
wants to merge 124 commits into from
Closed

Conversation

jakewilliami
Copy link
Owner

@jakewilliami jakewilliami commented Mar 29, 2023

I don't know what just happened but I had a bunch of local changes that I tried to push and it didn't work, so now I have lost some of those change, and have push a new branch, even though #55 was already merged.

2333ea1 had some of my local changes (a bunch of generated dot files and JSON files and shell scripts were removed unfortunately).

I will probably want to delete this MR and branch in future.

jakewilliami and others added 30 commits October 23, 2020 15:57
Added OS BSD to .travis.yml
Implemented floating point spelled out (closes #14)
Began work on complex numbers (addresses #15)
Removed 1.0 from travis testing, as DecFP requires 1.3 or greater
Gabrielforest and others added 29 commits June 2, 2022 13:58
Implement language support for Portuguese
Instead of creating IOBuffers within each function, pass around IOBuffers to modifying functions and only take from them at the end of the process!  This vastly reduces the allocations as we are no longer creating a new IOBuffer within each function.  However, it is still not as fast as it was (but its memory usage is better).
Instead of using a weird `reverse` method for string manipulation in printing of ordinals, we use a util function.  Also, use `IOBuffer` where practical for alt convert methods
Optimise English implementation
A self-referencing, spelled-out sentence such as "This sentence has
ten 'a's and four 'b's and twelve 'c's and ..." would be a cool
example project to make
Implement Doov's brute force idea, and continue work on my attempt at
a somewhat efficient solution (though I think that Doov's idea is
perhaps the best way, as I can see my solution becoming quite complex)
Previous commits working on this did not finish computing, as they
would form cycles.  The way around this is to use some level of
randomisation.  This commit implements a working solution.
Optimisation and comments to follow.
When we calculate the diffs from previous to current countmaps in
self-referencing example, we previously created a new dictionary.  As
we never use the previous countmap after this point, we can simply
modify it!

This seems to speed things up immensely:
 Without mutation of previous:
     julia> @Btime construct_true_pangram();
       8.100 s (65276305 allocations: 4.59 GiB)

  With mutation of previous:
     julia> @Btime construct_true_pangram();
       2.992 s (26294381 allocations: 1.67 GiB)

Next steps for optimisation would be to reuse the same buffer for
pangram construction.
In this commit, I dynamically determined when we were at the
second-to-last or last item in our alphabet, as well as some other
miscellaneous clean up/improvement.

I had planned to only keep track of one IOBuffer in this commit, but
after some benchmarking, this did not seem to improve performance:

With one IOBuffer:
    julia> for _ in 1:20
               @Btime construct_true_pangram();
           end
      50.843 s (409455124 allocations: 26.26 GiB)
      765.545 ms (6216088 allocations: 408.26 MiB)
      3.035 s (24595631 allocations: 1.58 GiB)
      33.922 s (273593730 allocations: 17.55 GiB)
      2.409 s (19406731 allocations: 1.24 GiB)
      1.898 s (15389028 allocations: 1010.77 MiB)
      11.832 s (95874131 allocations: 6.15 GiB)
      44.362 s (357949407 allocations: 22.96 GiB)
      2.459 s (19967197 allocations: 1.28 GiB)
      58.991 s (475764808 allocations: 30.52 GiB)
      1.158 s (9389309 allocations: 616.70 MiB)
      25.636 s (207005638 allocations: 13.28 GiB)
      51.847 s (418013672 allocations: 26.81 GiB)
      31.430 s (253300737 allocations: 16.25 GiB)
      40.973 s (327232985 allocations: 20.99 GiB)
      4.422 s (35438747 allocations: 2.27 GiB)
      253.188 ms (2041985 allocations: 134.09 MiB)
      36.172 s (292155055 allocations: 18.74 GiB)
      49.306 s (396551677 allocations: 25.44 GiB)
      55.549 s (446664880 allocations: 28.65 GiB)

With IOBuffer constructed in each construct_pangram call:
    julia> for _ in 1:20
               @Btime construct_true_pangram();
           end
      5.503 s (47207580 allocations: 3.00 GiB)
      7.020 s (60222956 allocations: 3.82 GiB)
      12.165 s (103483280 allocations: 6.57 GiB)
      3.002 s (25651341 allocations: 1.63 GiB)
      5.414 s (46572408 allocations: 2.96 GiB)
      5.389 s (45930841 allocations: 2.92 GiB)
      5.182 s (44124955 allocations: 2.80 GiB)
      9.869 s (85081002 allocations: 5.40 GiB)
      2.006 s (17076024 allocations: 1.08 GiB)
      646.730 ms (5628166 allocations: 365.71 MiB)
      666.316 ms (5935178 allocations: 385.65 MiB)
      1.553 s (13790449 allocations: 896.04 MiB)
      11.421 s (101638768 allocations: 6.45 GiB)
      199.693 ms (1800406 allocations: 117.00 MiB)
      20.385 s (180114986 allocations: 11.43 GiB)
      838.932 ms (7439715 allocations: 483.56 MiB)
      868.218 ms (7668002 allocations: 498.32 MiB)
      9.911 s (87764372 allocations: 5.57 GiB)
      890.219 ms (7936633 allocations: 515.73 MiB)
      11.319 s (100671495 allocations: 6.39 GiB

I will have to investigate this a little further, as I don't think it
should make such a difference.  Unfortunately, it's a little tricky to
benchmark with it being random!
While it is true that certain characters in this sentence are never
used in spelling out numbers, creating an initial state accounting for
these characters has not proven useful, and hence this commit removes
this logic.

I'm not entirely sure why, but for some reason, having this initial
state considerably slows down the programme.
Despite notes made in 7849a1c, after testing again it appears that
constructing IOBuffer only once, at the start of the main function,
provides a slight performance improvement (at least in terms of
memory, and perhaps negligible difference in speed, but as previously
noted, it is difficult to benchmark random functions).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants