Sterre
Issues to be discussed
Introduction:
I added more relevant facts on stock market participation during the financial crisis. Is this enough?
For me, I think it’s clear what you want to do.
I shortened my intro to make the story flow better. Is this closer to what you suggested, or am I going the wrong way with this?
Literature review:
First, I reordered the arguments, as I think you proposed.
Second, I made a table summarizing all the relevant literature and its influence on stock market participation; thank you for the tip.
Third, as I changed the layout and moved my hypotheses to the very end of the literature review, I find it difficult to draw direct conclusions. I believe this makes my story a bit less convincing.
My suggestion is to draw my conclusions right after each subsection instead of at the very end: summarize each subsection and state my approach, while keeping the hypotheses at the end. What do you think of this? Or do you have another suggestion to make it more concise? The table could then function as an extra summary.
Last, you commented in my draft that I need to explain the participation puzzle. However, I already explained the participation puzzle in the introduction. Do you think it is still necessary to explain it in the literature review as well?
Methodology & Data:
I set up my equation for the probit regression, which I will run before the next deadline. I am very curious what you think of this equation. Do you think it covers the changes in participation for each period?
I don’t think it is strictly correct: you can formulate probit as a latent variable model, but then you should use $y^*$ and not $y$ on the right-hand side of the equation. Furthermore, it does not generalize readily to a panel context, which you have to deal with. I provide some notation below which might (or might not) help.
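As a sketch of the latent variable formulation referred to here, in generic notation ($x_{it}$, $\theta$ and $\Phi$ are placeholder symbols assumed for illustration, not taken from the draft), probit can be written as:

$$ y^*_{it} = x_{it}'\theta + \epsilon_{it}, \qquad \epsilon_{it} \sim N(0, 1), \qquad y_{it} = \mathbf{1}[y^*_{it} > 0] $$

so that the observed outcome $y_{it}$ appears only on the left-hand side of the participation probability:

$$ P(y_{it} = 1 \mid x_{it}) = P(\epsilon_{it} > -x_{it}'\theta) = \Phi(x_{it}'\theta) $$

where $\Phi$ is the standard normal CDF.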
I added two tables with summary statistics. Is this enough, or do I also need to include median values, for example?
Also, for the probit regression I have to reduce outliers in the income and wealth variables. I wanted to take the natural logarithm of both variables, as many papers suggest. However, observations with negative wealth then drop out, which creates major gaps in my study. Is there another way to keep those negative values? I was thinking of capping the outliers instead, for example setting the lowest outliers to a minimum value and the largest outliers to a maximum value. This keeps the same number of observations but reduces the influence of the outliers. What do you think?
Why do you think you have to delete outliers? I think it’s common practice in finance, but I don’t like the idea of deleting (valid) data at all. To keep all observations, you could use $\log(1 + \text{Wealth} - \min_i \text{Wealth})$, where the added 1 keeps the logarithm defined for the observation with the lowest wealth. It is also common to use $\log (1+\text{Wealth})$, but that is only defined for $\text{Wealth} > -1$.
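A minimal sketch of the two options discussed here, the shifted log transform and capping outliers at chosen bounds. The wealth figures and the bounds are made up for illustration, and the `1 +` inside the logarithm is an assumption added so that the lowest observation maps to zero rather than to an undefined `log(0)`.

```python
import math


def shifted_log(values):
    """Return log(1 + w - min(w)), which is defined for every observation."""
    lowest = min(values)
    return [math.log(1.0 + v - lowest) for v in values]


def cap_outliers(values, lower, upper):
    """Cap values outside [lower, upper] at the bounds; no rows are dropped."""
    return [min(max(v, lower), upper) for v in values]


# Hypothetical household wealth figures, including negative net wealth.
wealth = [-5000.0, 0.0, 1200.0, 40000.0, 2500000.0]

log_wealth = shifted_log(wealth)
capped_wealth = cap_outliers(wealth, lower=-1000.0, upper=100000.0)
```

Both transformations keep the number of observations unchanged. The shifted log preserves the ordering of households, while capping discards the information about how extreme the outliers are.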
Hypotheses
H1: In financially uncertain times, overall stock market participation drops.
I think this hypothesis is a nice starting point, but it could be elaborated: testing it seems to amount to stating a fact. The added value may be limited if you don’t also investigate the possibility of heterogeneous treatment effects, i.e., a given increase in uncertainty might affect the likelihood of stock market participation differently for different individuals. This can relate to demographic variables, but also to whether individuals were already active in the stock market in a previous period.
H2: More sophisticated, higher-income, and wealthier households are more likely to stay in the stock market during financially uncertain times.
Note that this hypothesis is conditional on already participating in the stock market. So in that sense, you could reframe your hypotheses, saying that you expect the same relationships as in hypothesis 1 to hold, conditionally on individuals already participating in the stock market.
H3: Stock market participation movements are persistent after economically uncertain times.
This hypothesis seems a little vague to me.
Methodology
- You have a panel data set
I think you can just use OLS with robust standard errors (i.e., a linear probability model). A panel (FE) estimator in a probit framework is also possible, but more difficult to implement (Stata: xtprobit, R: pglm).
Model:
$$ P(Y_{it}=1) = F(\alpha_i + \beta \cdot \text{Uncertainty}_{it} + \gamma \cdot \text{Demographics}_{it} + \delta \cdot \text{Uncertainty}_{it} \times \text{Demographics}_{it}) $$
where $Y_{it}=1$ if individual $i$ participates in the stock market at time $t$, and 0 otherwise (H1). Here $F$ is the identity for the linear probability model (OLS) and the standard normal CDF for probit.
For hypothesis 2, you employ essentially the same model, but condition the data on individuals who participate in the stock market at time $t-1$.
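Conditioning on participation at $t-1$ could be sketched as follows. The panel layout, household identifiers and values are hypothetical, and the code assumes that periods are consecutive integers (for example, annual survey waves), so that `t - 1` is the previous wave.

```python
# Hypothetical long-format panel: (household_id, period, participates).
panel = [
    (1, 2007, 1), (1, 2008, 1), (1, 2009, 0),
    (2, 2007, 0), (2, 2008, 0), (2, 2009, 1),
    (3, 2007, 1), (3, 2008, 0), (3, 2009, 0),
    (4, 2007, 1), (4, 2008, 1), (4, 2009, 1),
]


def condition_on_lagged_participation(rows):
    """Keep observations whose household participated in the previous period."""
    status = {(hh, t): y for hh, t, y in rows}
    return [(hh, t, y) for hh, t, y in rows if status.get((hh, t - 1)) == 1]


def stay_rate_by_period(rows):
    """Share of previous-period participants who still participate, per period."""
    counts = {}
    for _, t, y in rows:
        stayed, at_risk = counts.get(t, (0, 0))
        counts[t] = (stayed + y, at_risk + 1)
    return {t: stayed / at_risk for t, (stayed, at_risk) in sorted(counts.items())}


h2_sample = condition_on_lagged_participation(panel)
stay_rates = stay_rate_by_period(h2_sample)
```

Households without an observation in the previous period drop out of `h2_sample`, so the first wave is lost and any gaps in the panel reduce the sample further.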
- Have you thought about attrition of the panel data?
- It might be worthwhile to look into hazard models (risk of participating, or risk of not participating) in relation to hypothesis 3.
- Can you match the index (on a monthly basis) to the time at which individuals took the survey?
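Matching a monthly index to survey dates could look roughly like the sketch below. The index values, respondent identifiers and dates are invented for illustration, and it assumes that the survey data records a completion date for each respondent.

```python
from datetime import date

# Hypothetical monthly uncertainty index keyed by (year, month).
monthly_index = {
    (2008, 9): 31.2,
    (2008, 10): 59.9,
    (2008, 11): 55.3,
}

# Hypothetical survey completion dates per respondent.
survey_dates = {
    "hh_001": date(2008, 9, 14),
    "hh_002": date(2008, 10, 3),
    "hh_003": date(2008, 12, 1),  # no index value available for this month
}


def match_index(dates, index):
    """Attach the index value of the survey month; None when the month is missing."""
    return {hh: index.get((d.year, d.month)) for hh, d in dates.items()}


matched = match_index(survey_dates, monthly_index)
```

Respondents whose survey month has no index value get `None`, which makes it easy to check how many observations cannot be matched.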