Luc

Introduction

  • Nice structure: you start with the general idea of market efficiency.

  • Then you move to deviations from market efficiency and pose the research question.

In order to test the relevant hypotheses, it is crucial to design an appropriate and valid methodology. For this particular instance, the Fama-French 3-factor (FF3) model will be utilized as inspiration (Fama & French, 1992). Utilizing the FF3 model to measure mispricing, an additional mispricing factor will be computed and applied.

This paragraph may contain too much detail. I would rather see your own approach: you estimate one or several asset pricing models, and you attempt to explain the residual returns that remain after correcting for systematic factors.

Literature

The structure also seems clear to me.

It is interesting to note that these mispricing factors can be utilized as control variables in order to remove noisy estimates and more accurately establish a relationship between more novel anomalies and mispricing, which can be extremely useful for this particular investigation into the relationship between anomalous returns and hype, in turn classifying hype as an anomaly in and of itself.

Here, I think it would be better to describe more precisely what Stambaugh and Yuan actually do, rather than to summarize the implications for your own research. The implications should be there, but in the Methodology section.

Mispricing cannot always be explained by Bayesian models or factors, and always depends on some market uncertainty or irrationality.

This is very ad hoc and confusing. Where do Bayesian models come from in the discussion?

Planning to put the asset pricing models in a table to make it more clean and easier to read

This is a good idea. I also suspect that readers are quite familiar with these models, so you don’t need to describe them in depth.

Hypotheses

Firstly, the put-call ratio of each stock will be utilized as a measure of investor-sentiment

Seems good to me. But why can you not also use Google Trends or something similar?

Methodology

The combined variable of hype will be taken as a natural log in order to generalize its output and make it readily comparable with both the size and value premiums, this will be further elaborated upon under section 3.1.3.

It is okay to combine them, but you can also just include all variables in the regression.
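To illustrate what I mean by including all variables, one possible specification is written out here. This is only a sketch in my own notation, not taken from your draft: $\mathrm{PCR}$ (put-call ratio), $\mathrm{VOL}$ (trading volume) and $\mathrm{PRESS}$ (business press sentiment) are assumed names for the individual hype components, and whether they enter as stock-level variables or as constructed factors is still your choice.

$$
R_{i,t} - R_{f,t} = \alpha_i + \beta_i\,\mathrm{MKT}_t + s_i\,\mathrm{SMB}_t + h_i\,\mathrm{HML}_t + \gamma_1\,\mathrm{PCR}_{i,t} + \gamma_2\,\mathrm{VOL}_{i,t} + \gamma_3\,\mathrm{PRESS}_{i,t} + \varepsilon_{i,t}
$$

Each hypothesis would then correspond to a test on one of the $\gamma$ coefficients, instead of a single test on the combined variable.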

The exposition of the methodology is not very clear; I think you should use more equations in the parts where you did not, and fewer equations in the parts where you did!

Why do you need the portfolio to be the unit of analysis instead of the individual stock?

Questions

What time-frame to select. I think it’s incredibly difficult to select a certain time-frame, and relevant literature uses between 5-30 years. Is there maybe a specific rule for this?

I think anything between 5 and 15 years is suitable. The choice is largely arbitrary, so you can just let yourself be guided by data availability.

The US CRSP small and large cap indices are rather large, how should I go about choosing which stocks to include in my analysis from them as the benchmark? Or are 1300 stocks per index acceptable/doable?

I think this is acceptable, but I would also be fine with using the S&P 500 instead.

How to test all hypotheses: should I run only one Fama-French regression with all different variables having different factors (not just PMN as a combined factor, but rather taking a factor for Trading Volume, Investor Sentiment, and Business Press Sentiment individually)? I think this can be a big challenge.

I don’t yet see how you would test your hypotheses. My guess would simply be: calculate the expected return for stock $i$ at time $t$, and then try to explain the residuals in a regression containing Hype plus control variables.
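The two steps I have in mind can be sketched roughly as follows. This is a minimal illustration on synthetic data, not a recommendation for a specific implementation: the factor series, the `hype` variable and all parameter values are made up, and control variables and appropriate standard errors are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic panel (assumption): T months, N stocks, 3 systematic factors.
T, N = 120, 50
factors = rng.normal(0.0, 0.04, size=(T, 3))    # stand-ins for MKT, SMB, HML
betas_true = rng.normal(1.0, 0.3, size=(3, N))  # made-up factor loadings
hype = rng.normal(0.0, 1.0, size=(T, N))        # hypothetical hype measure
gamma_true = 0.01                               # hype effect built into the fake data
noise = rng.normal(0.0, 0.02, size=(T, N))
excess_ret = factors @ betas_true + gamma_true * hype + noise

# Step 1: time-series OLS per stock of excess returns on the factors.
X = np.column_stack([np.ones(T), factors])      # (T, 4), intercept included
coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)  # (4, N), one column per stock
expected = X @ coef                             # fitted (expected) returns
resid = excess_ret - expected                   # part of returns left unexplained

# Step 2: pooled OLS of the residuals on hype.
# Control variables would enter as additional columns of Z.
Z = np.column_stack([np.ones(T * N), hype.ravel()])
gamma_hat, *_ = np.linalg.lstsq(Z, resid.ravel(), rcond=None)
print(f"estimated hype coefficient: {gamma_hat[1]:.4f}")
```

With real data, step 1 would use the actual factor returns and step 2 would add your control variables; the point is only that the hypotheses are tested on the second-step coefficients.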