WTP Space, GMNL, and Starting Values #22
Hi @chevrotin, you can pass initial values to xlogit using the init_coeff argument:

```python
varnames = ["a", "b", "c"]
ml = MixedLogit()
ml.fit(X=df[varnames],
       y=df["choice"],
       varnames=varnames,
       ids=df["chid"],
       randvars={"a": "n", "b": "n", "c": "n"},
       ...,
       init_coeff=np.array([.1, .1, .1, .1, .1, .1, .1])  # Order -> [a, b, c, sd.a, sd.b, sd.c, scale_factor]
       )
```

Regarding your other question, the current version of xlogit does not allow the …
Thanks Cristian. The code works. I managed to estimate a WTP-space model with my data. Providing starting values, derived from a preference-space ML model, seems to do the trick.

A few things to note. The model works with a lower number of draws (4,000); however, it runs into a memory problem (similar to our discussion in logitr) when I specify 10,000. I have 8 random variables. I imagine it would become much more constraining when you allow correlated parameters in future versions. I wonder if cupy allows for some degree of memory management, where it can pass some of the memory burden on to the hard drive.

Essentially, you have a rocket here. The estimation took less than 15 minutes. We have plenty of room to dial the speed down to allow for a larger model.

Error message: cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 2,691,840,000 bytes (allocated so far: 32,640,581,120 bytes).
I think the batch processing functionality in xlogit can fix your GPU OutOfMemoryError. Simply pass the batch_size argument.

Regarding the offloading to disk, this is feasible, but it requires a separate implementation and it would be quite slow. However, the batch processing functionality should solve most of the issues, and the only limitation would be the computer's RAM. Note that 16 GB of RAM was enough for half a million draws, so this should not be that big of a constraint.
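A minimal sketch of how the batch option might be used. The batch_size argument is the one mentioned above; n_draws is assumed here to be the argument that controls the total number of draws, and the other arguments follow the earlier example:

```python
from xlogit import MixedLogit

# Hedged sketch: split the random draws into chunks that fit in GPU memory.
# batch_size appears in this thread; n_draws is assumed to set the total draws.
ml = MixedLogit()
ml.fit(X=df[varnames],
       y=df["choice"],
       varnames=varnames,
       ids=df["chid"],
       randvars={"a": "n", "b": "n", "c": "n"},
       n_draws=50000,        # large number of draws
       batch_size=5000)      # process the draws in chunks of 5,000
```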
Ah.. thanks. Glad that you have already solved that. Wasn't reading carefully enough. Will try the batch function.
Sure, please let me know how it works. Unfortunately, xlogit's documentation does not properly elaborate on batch processing and WTP space models. I definitely need to spend some time expanding the documentation to better highlight some of the nice features implemented in xlogit.
If you need a helping hand with the documentation, I'd be glad to help.
That would actually be fantastic and I would love your help. If you can create some examples for WTP space models in a Google Colab notebook, I can directly integrate such a notebook into the documentation. It would be great if one example includes passing initial values to the estimation using the init_coeff argument.
Alright.. Will do
Thanks a lot @chevrotin!
https://colab.research.google.com/drive/1bjuh4-oAcsz5m9rN4uYnmQtqYPeC-Vwf?usp=sharing

@arteagac Please see the examples of both starting values and using batch_size to estimate with ndraws = 50000. Note that the results differ a bit from logitr. I think this might be because of the multiple starting values that John has implemented. The log-likelihood score suggests that the solution might be a local maximum.
Hi @chevrotin,

Dear @jhelvy, can you please give me some insight into how your multi-start approach samples the initial values? What upper and lower bounds do you use for the random sampling of initial values? I am trying to figure out why logitr yields a slightly better final log-likelihood despite using a lower number of draws. Any ideas why this might be happening? Is it perhaps the multi-start? Do you maybe use Sobol draws by default? I don't think the cause is the optimization routine, because we both use Limited-Memory BFGS and similar tolerance values (>= 1e-6).
Cool thread here folks. A couple things:
This is also why I went through the trouble of building in a multi-start loop. I had enough convergence issues with WTP space models that I didn't want users to have to do this much work; just let it iterate more to find a better solution. But this is also true for mixed logit in the preference space. Mixed logit models have non-convex log-likelihood functions, so really we should be running them from different starting points to be a little more confident in the solutions.
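A minimal multi-start sketch in the spirit described above, using xlogit's init_coeff argument from earlier in this thread. It is not logitr's actual heuristic: the sampling bounds, number of starts, the "price" column, and the assumption that the fitted model stores a loglikelihood attribute are all illustrative:

```python
import numpy as np
from xlogit import MixedLogit

# Illustrative multi-start loop: fit from several random starting vectors and
# keep the solution with the best log-likelihood.
n_coeff = 2 * len(varnames) + 1              # means + std devs + scale factor (order from above)
best_ll, best_model = -np.inf, None
for seed in range(10):
    start = np.random.default_rng(seed).uniform(-1, 1, size=n_coeff)  # bounds are arbitrary
    ml = MixedLogit()
    ml.fit(X=df[varnames], y=df["choice"], varnames=varnames,
           ids=df["chid"], randvars={"a": "n", "b": "n", "c": "n"},
           scale_factor=df["price"],          # WTP-space scale variable (assumed column name)
           init_coeff=start)
    if ml.loglikelihood > best_ll:            # assumes the model stores its final log-likelihood
        best_ll, best_model = ml.loglikelihood, ml
```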
Thanks a lot for your detailed explanation, @jhelvy. I will check the heuristics included in your multi-start loop. Given the importance of the multi-start functionality, I might include it in xlogit.
Glad to hear from you, John. Here are some of my thoughts:

@jhelvy It's true that GMNL is not necessary for WTP space; plain ML can do it. In my own implementations, though, GMNL tends to converge better. To avoid the long tail, you can use draws from a truncated normal (±2) as the underlying normal distribution of the log-normal price parameter.

@arteagac I'm kinda curious about xlogit's implementation of WTP space. I take it that the scale parameter is assumed to be fixed, and therefore does not have a standard deviation associated with it. Or does it? Also, does it automatically take the negative of price (when scale_parameter = price)?

Merry Christmas to both of you, Cristian and John.
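A small sketch of the truncation idea mentioned above; the mean and standard deviation of the underlying normal are purely illustrative:

```python
import numpy as np
from scipy.stats import truncnorm

# Simulate a log-normal price coefficient whose underlying normal draws are
# truncated at +/- 2 standard deviations, which removes the extreme right tail.
mu, sigma = -1.0, 0.5                        # illustrative underlying-normal parameters
z = truncnorm.rvs(-2, 2, size=10_000)        # standard normal truncated at +/- 2
price_coef = -np.exp(mu + sigma * z)         # negative log-normal price coefficient
print(price_coef.min(), price_coef.max())    # tails are bounded by the truncation
```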
Hi @chevrotin, you are correct: in xlogit the scale parameter is fixed and does not have an associated standard deviation. Yes, xlogit uses the negative of the scale parameter, as it follows the WTP space formulation shown below (see $p_{nj}$), which was kindly provided by @jhelvy.
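The image with the formulation is not preserved in this text. A standard way to write the WTP-space utility in the spirit of Train and Weeks (2005), consistent with the fixed scale and negative price described above, is

$$U_{nj} = \lambda \left( \omega_n' x_{nj} - p_{nj} \right) + \varepsilon_{nj},$$

where $\lambda$ is the (fixed) scale parameter, $\omega_n$ are the random WTP coefficients on the non-price attributes $x_{nj}$, and $p_{nj}$ is the price, which enters with a negative sign.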
That also explains the lower ll score from xlogit. Thanks 👍👍
@chevrotin I don't have a way to allow bounds on the underlying normal for a log-normal parameter. I just use …

But I'm curious how you've been able to implement GMNL for WTP space models. Do you use custom code? I've struggled to get it to converge for a WTP space setup using alternatives like {gmnl} and {apollo} (see, for example, this article). WTP space models are just weird, and in my experience there haven't been many developers who have made them easy to estimate. Other than {logitr}, there's a routine in Stata that seems pretty robust, and of course there's xlogit. I guess you could go Bayesian with Stan too and set this up, but at that point you're doing a lot of coding yourself and you really, really need to know what you're doing. I wanted WTP space models to be easily accessible to researchers who aren't programming experts, so I've tried to make the default settings in {logitr} work well for most.
@jhelvy I use the GMNL estimator with STATA. It usually takes days to estimate. You are right, there aren't many routines that are fast and complete.
Days!? 😲 Wow... I have no idea why. For licensed software, you'd think they'd spend some time doing a little performance engineering. I don't have gmnl on the list to support in logitr, though; there are just too many other things to add right now that are more pressing, like maybe support for xlogit.
Yeap. Imagine, at the 178th iteration, "flat region is encountered." No result is provided. Multicore and GPU optimizations are limited in STATA and NLOGIT.

GMNL is not everybody's cup of tea. Some are outright dismissive about its usefulness. Where it shines is when we expect the distribution to be top- and bottom-heavy, or as Fiebig calls it, "lexicographic" preference. For some unknown reason, GMNL always outperforms ML in STATA, in my experience, for WTP space models. It may be because people tend to be extreme in their stated preferences in surveys.

Nevertheless, if one is after the newest and shiniest, Train's mixed-mixed logit seems to be the most rewarding at this point.

I agree about the priorities. It would be tremendous just to get a fast ML with correlated preferences and WTP space ready. Many hands make light work, as you have shown here. I am excited about the potential here.
Dear @chevrotin, thank you again for the time you dedicated to developing this example. Regarding the slight difference with logitr's final log-likelihood, I don't think it is caused by the fixed scale parameter. I think logitr also uses a fixed scale parameter by default, which you can change to random using the …

Dear @jhelvy, …
Yes, logitr uses a fixed scale parameter by default. It's only random if the user specifies so using … In …
Thanks a lot for the clarification, John!
Thanks for the awesome package, Cristian.
For the implementation of WTP space models, having reasonable starting values may be critical. My attempt to estimate with my own data resulted in a convergence issue: "Nonpositive definiteness in Cholesky factorization in formt; refresh the lbfgs memory and restart the iteration." The log-likelihood is poor, indicating that the algorithm got stuck somewhere.
My intuition is that with proper starting values, the issues of local maxima or flat regions can be tackled. Is it possible to provide starting values to the procedure?
Further, Greene and Hensher (2010; Transportation Part D) and Fiebig et al. (2010; Marketing Science) have shown that WTP space can be implemented with GMNL. The advantage of that is that the price parameter follows a log-normal distribution. I wonder how the present setup of xlogit compares to/allows this specification.
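For context on the GMNL specification referenced here, a rough statement of the Fiebig et al. (2010) coefficient structure, written from memory (so please check the original paper), is

$$\beta_n = \sigma_n \beta + \left[ \gamma + \sigma_n (1 - \gamma) \right] \eta_n, \qquad \sigma_n = \exp(\bar{\sigma} + \tau \epsilon_n),$$

with $\epsilon_n \sim N(0, 1)$, $\eta_n \sim N(0, \Sigma)$, and $\gamma \in [0, 1]$. The scale heterogeneity $\sigma_n$ is what produces the log-normally distributed price/scale term, and setting $\tau = 0$ recovers the usual mixed logit.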
Again, thanks for the awesome package. Xlogit is unrivalled in speed. It would be great to complete it with more useful features. I know this is a lot to ask; I would help if I could overcome the learning curve of programming it.