Setting up Pluto.jl
Pluto is nice as you can prototype pretty fast.
Pluto.jl has its own dependency management included!
If you want to add packages that are not registered, you have to activate your own environment, which switches off Pluto's built-in package management for that notebook. For example:
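using Pkg
Pkg.activate(mktempdir())  # temporary environment for this notebook
Pkg.add(path="/path/to/your/package/CoolPackage")  # local, unregistered package
Pkg.add(url="https://github.com/username/MyPackage.jl")  # unregistered package from a Git URL
using CoolPackage, MyPackage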
To run Pluto in the first place, use:
]add Pluto
using Pluto
Pluto.run()
Task 1: Visualize some statistic properties
1. Data
Generate 500 normally distributed samples
You might want to make your results reproducible by fixing the seed of the random number generator. The two most common random number generators used in Julia are Random.MersenneTwister and StableRNGs.StableRNG.
For this exercise I would recommend the latter (even though MersenneTwister is much more commonly used), so run:
using StableRNGs
randn(StableRNG(1),100)
to get 100 random numbers.
Scale the random numbers so that std(x) ≈ 10
functionize it
Next, wrap that code in a function simulate which takes two arguments: a random seed and the number of samples.
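One possible shape for such a function is sketched below. It assumes that multiplying standard-normal samples by 10 is an acceptable way to reach std(x) ≈ 10; the keyword argument `scale` is a made-up name for illustration, not something the exercise prescribes.

```julia
using StableRNGs   # third-party package recommended in the text
using Statistics   # standard library, provides std

# Sketch: draw `n` standard-normal samples from a StableRNG seeded with `seed`
# and multiply them by `scale` (hypothetical keyword), so that std(x) ≈ scale.
function simulate(seed, n; scale = 10)
    rng = StableRNG(seed)
    return randn(rng, n) .* scale
end

x = simulate(1, 500)
std(x)   # close to 10, but not exactly 10
```

Calling `simulate` twice with the same seed returns identical samples, which is what makes the later steps reproducible.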
2. Cumulative mean
Calculate the cumulative mean of a single simulation and save it to a variable.
Note that there is no cummean function, but clever element-wise division in combination with cumsum should lead you there, or you just use a loop 🤷
cumsum(x) ./ (1:length(x))
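For comparison, the loop variant mentioned above could look roughly like this. The function name `cummean_loop` is hypothetical; the small input vector is only there to make the result easy to check by hand.

```julia
# Sketch of the loop variant: keep a running total and divide by the
# number of elements seen so far.
function cummean_loop(x)
    out = similar(x, Float64)
    total = 0.0
    for (i, xi) in enumerate(x)
        total += xi
        out[i] = total / i
    end
    return out
end

x = [2.0, 4.0, 6.0, 8.0]
cummean_loop(x)              # [2.0, 3.0, 4.0, 5.0]
cumsum(x) ./ (1:length(x))   # same values via broadcasting
```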
3. Plotting!
Now for your first plot. Use a scatter plot [1] to visualize the cumulative mean output. If you do not generate a Figure() + ax = f[1,1] = Axis(f) manually, you can get both back from the scatter call: f, ax, s = scatter(). This is helpful, as we later want to extend the Axis and Figure with other plot elements.
Use hlines! to add a horizontal line at your “true” value.
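A minimal sketch of this step, assuming CairoMakie is installed. The data here is placeholder noise rather than your own simulation, and 0 is used as the "true" value because the samples are centered on zero.

```julia
using CairoMakie

# Placeholder data standing in for your own cumulative mean (assumption)
x = randn(500) .* 10
cm = cumsum(x) ./ (1:length(x))

# The non-mutating scatter call creates Figure and Axis and returns them
# together with the plot object
f, ax, s = scatter(1:length(cm), cm)

# The samples have mean 0, so draw the reference line there
hlines!(ax, [0.0]; color = :black)
f
```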
4. Subplot
simulate repeatedly
Let's simulate 1000 datasets, each with a different seed, and take the mean over all simulated values in each dataset.
An easy way to call a function many times is to broadcast it over an array created e.g. via 1:1000. You could also use map to do it, but I don't think it is as clear :)
simulate.(1:1000,nmax)
Mean it
Calculate the mean of each simulation:
using Statistics
mean.(simulate.(1:1000,nmax))
# or
sum.(...) ./ nmax
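The broadcast and map variants can be compared side by side as follows. This is only a sketch: the `simulate` defined here is a stand-in that uses the standard-library Xoshiro generator so the block runs without extra packages, and `nmax = 500` is an assumed value.

```julia
using Random, Statistics

# Stand-in for your own `simulate` (assumption, not the exercise solution)
simulate(seed, n) = randn(Xoshiro(seed), n) .* 10

nmax = 500

# Broadcasting: one call per seed, nmax is treated as a scalar
sims = simulate.(1:1000, nmax)

# Equivalent version with map and an anonymous function
sims_map = map(seed -> simulate(seed, nmax), 1:1000)

# One mean per simulation, computed in two ways
means = mean.(sims)
means_manual = sum.(sims) ./ nmax
```

Both `sims` and `sims_map` are vectors of 1000 vectors, so `mean.` has to be broadcast as well to get one value per simulation.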
Add it as a subplot
We want to add a histogram of the 1000 means to the plot.
- Add a new Axis to f[1,2]
- Use it to plot the histogram of the means via hist!
- Don't forget to set direction=:x to flip the histogram
- Link the axes using linkaxes! (or linkyaxes! to link only the y-axes)
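Put together, the subplot steps might look like the sketch below. It assumes CairoMakie is installed and again uses a stand-in `simulate` based on the standard-library Xoshiro generator; only the y-axes are linked here, because the two panels share the value axis but not the x-axis.

```julia
using CairoMakie, Random, Statistics

# Stand-in for your own `simulate` (assumption)
simulate(seed, n) = randn(Xoshiro(seed), n) .* 10

nmax = 500
cm = cumsum(simulate(1, nmax)) ./ (1:nmax)   # cumulative mean of one simulation
means = mean.(simulate.(1:1000, nmax))       # one mean per simulation

f, ax, s = scatter(1:nmax, cm)

# Second axis in the same row, second column
ax_hist = Axis(f[1, 2])

# direction = :x flips the histogram so the bars extend along the x-axis
hist!(ax_hist, means; direction = :x)

# Both panels show the same quantity on the y-axis
linkyaxes!(ax, ax_hist)
f
```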
5. Prettify it
There are some simple tricks to make a plot look nicer:
- Remove the “box” using hidespines!(ax, :r, :t)
- Resize the right subplot to be smaller using colsize! and Relative(X)
- Hide the x-grid (type ax. + TAB to find all possible attributes)
- Hide the xlabels + xticks + bottomspine from the right subplot
- Add two Labels, (A) and (B), to the plot
- Bonus: use color to color the cumulative sum samples according to how many samples went into that sum. colormap=:Reds looks good to me!
You can create a slightly fancier label by adding a circle around it :)
Label(f[1,2,TopLeft()],"B",padding=[0,0,5,0])
Label(f[1,2,TopLeft()],"○",padding=[0,0,8,0],fontsize=30)
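The prettifying steps could be combined roughly as follows. This is a sketch with placeholder data, assuming CairoMakie is installed; the relative width of 0.25 is an arbitrary choice, not a value given in the exercise.

```julia
using CairoMakie

# Placeholder data (assumption): a cumulative mean and some fake "means"
n = 200
cm = cumsum(randn(n) .* 10) ./ (1:n)

# Color each point by the number of samples that went into it
f, ax, s = scatter(1:n, cm; color = 1:n, colormap = :Reds)
ax_hist = Axis(f[1, 2])
hist!(ax_hist, randn(1000); direction = :x)

hidespines!(ax, :r, :t)                  # remove right and top spine of the left panel
colsize!(f.layout, 2, Relative(0.25))    # right column takes 25% of the width
ax.xgridvisible = false                  # hide the x-grid of the left panel
hidexdecorations!(ax_hist)               # hide x label, ticks, tick labels and grid
ax_hist.bottomspinevisible = false       # hide the bottom spine on the right

Label(f[1, 1, TopLeft()], "A")
Label(f[1, 2, TopLeft()], "B")
f
```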
Task 2: Interactivity!
Thanks to Pluto.jl's reactive backend, changing a value in one cell automatically updates all cells that depend on it, including plots.
We can use sliders instead of fixing the parameters of the simulation.
A slider is defined like this:
@bind yourVarName PlutoUI.Slider(from:to) # from:step:to is optional, step by def 1
Adding interactivity via sliders
- Define a slider that controls the number of samples from 1:500
- Define a second slider that adds a constant offset to all values of the simulation
- Make sure to fix the x/y-limits to get a nice-looking plot :-)
Once you get past the slightly awkward syntax, the following gives you a nice collection of sliders, checkboxes, widgets etc. that is drag-and-droppable and lives in a sidebar. Neat!
using PlutoExtras, PlutoUI
BondTable([
    PlutoExtras.@BondsList "Sliders" let
        "name A" = @bind(varA, PlutoUI.Slider(1:500))
        "name B" = @bind(varB, PlutoUI.Slider(-5:5))
    end
])
Task 3: AlgebraOfGraphics
For this task we need a dataset, and I chose the US egg dataset for its simplicity.
To load the data, use the following code:
using DataFrames, HTTP, CSV
# dataset via https://github.com/rfordatascience/tidytuesday/tree/master
df = CSV.read(download("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-04-11/egg-production.csv"), DataFrame)
If you don't like to use Pluto.jl, you can of course switch back to VSCode. In that case you have to create a new environment and add the packages you use first.
🥚 vs. 📅
Visualize the number of eggs against the year
To get a first overview, first(df), describe(df) and names(df) are typically helpful.
Split them up
Next, split them up: use the color and col mappings and choose reasonable columns from the dataset.
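The general pattern for splitting by color and col is sketched below on a toy table, assuming AlgebraOfGraphics, CairoMakie and DataFrames are installed. All column names (`year`, `eggs`, `kind`, `process`) are made up for illustration; replace `toy` with `df` and pick the real columns you found via names(df).

```julia
using AlgebraOfGraphics, CairoMakie, DataFrames

# Toy data with hypothetical column names (not the egg dataset)
toy = DataFrame(
    year = repeat(2016:2021, 4),
    eggs = rand(24) .* 100,
    kind = repeat(["a", "b"], inner = 12),
    process = repeat(repeat(["p", "q"], inner = 6), 2),
)

# x and y are positional; color splits within a panel, col splits into columns of panels
plt = data(toy) * mapping(:year, :eggs; color = :kind, col = :process) * visual(Scatter)
fg = draw(plt)
```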
Rotate the labels
Use the trick from the handout to modify a plot after it was generated: rotate the x-tick labels by about 30°.
Tip: Instead of rotating each axis manually, you can also replace the draw command in your pipeline with an anonymous function. This allows you to specify additional arguments, e.g. to the axis, for all “sub”-plots:
... |> x-> draw(x;axis=(;xlims = (-3,2)))
Note the ; before xlims: it enforces that a NamedTuple is created.
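The effect of the leading semicolon can be checked in plain Julia without any plotting package. The variable names below are only for illustration.

```julia
# With a leading semicolon: a NamedTuple with one field
nt = (; xlims = (-3, 2))

# A trailing comma has the same effect
nt_comma = (xlims = (-3, 2),)

# Without either, the parentheses just wrap an assignment,
# so `plain` is the tuple (-3, 2) and not a NamedTuple
plain = (xlims = (-3, 2))

nt isa NamedTuple      # true
plain isa NamedTuple   # false
nt.xlims               # (-3, 2)
```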
Footnotes
1. After a using CairoMakie.