Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Viewable reports #357

Merged
merged 4 commits into from
Nov 20, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 28 additions & 0 deletions lametro/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,34 @@ def get_last_action_date(self):
def topics(self):
return [s.subject for s in self.subjects.all()]

@property
def is_viewable(self):
'''
Sometimes, a Bill may be imported to Councilmatic, though it should not be visible to the public.
This issue summarizes why that might happen: https://github.com/datamade/la-metro-councilmatic/issues/345#issuecomment-421184826
Metro staff devised three checks for knowing when to hide or show a report:

(1) Is the view restricted, i.e., is `MatterRestrictViewViaWeb` set to True in the Legistar API? We skip these bills further upstream. https://github.com/opencivicdata/scrapers-us-municipal/pull/251

(2) Does the Bill have a classification of "Board Box"? Then, show it.

(3) Is the Bill on a published agenda, i.e., an event with the status of "passed" or "cancelled"? Then, show it.

This property coms into play when filtering the SearchQuerySet object in LAMetroCouncilmaticSearchForm (views.py).
Note: we filter the sqs object, rather than remove "hidden" bills from the Solr index.
This strategy minimizes complexity (e.g., attempting to implement an LAMetroBillManager),
and this strategy avoids making adjustments to the `data_integrity` script (https://github.com/datamade/django-councilmatic/blob/master/councilmatic_core/management/commands/data_integrity.py)
'''
if self.bill_type == "Board Box":
return True

events_with_bill = Event.objects.filter(agenda_items__bill_id=self.ocd_id)
passed_events = [event for event in events_with_bill if (event.status == 'passed' or event.status == 'cancelled')]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this array ever become very large? since we only care about the presence, but not how many, etc., it may be more memory-efficient to do something like:

for event in events_with_bill:
    if event.status in ('passed', 'cancelled'):
        return True
return False

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this list will never be very large – a bill may be associated with one or two events. More than that does not align with what Metro has done historically.

if passed_events:
return True

return False

class LAMetroPost(Post):

class Meta:
Expand Down
4 changes: 4 additions & 0 deletions lametro/search_indexes.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
class LAMetroBillIndex(BillIndex, indexes.Indexable):
topics = indexes.MultiValueField(faceted=True)
attachment_text = indexes.CharField()
viewable = indexes.BooleanField()

def get_model(self):
return LAMetroBill
Expand Down Expand Up @@ -44,3 +45,6 @@ def prepare_topics(self, obj):

def prepare_attachment_text(self, obj):
return ' '.join(d.full_text for d in obj.documents.all() if d.full_text)

def prepare_viewable(self, obj):
return obj.is_viewable
2 changes: 2 additions & 0 deletions lametro/views.py
Original file line number Diff line number Diff line change
Expand Up @@ -687,6 +687,8 @@ def search(self):
# Don't auto-escape my query! https://django-haystack.readthedocs.io/en/v2.4.1/searchqueryset_api.html#SearchQuerySet.filter
sqs = sqs.filter_or(attachment_text=Raw(self.cleaned_data['q']))

sqs = sqs.filter(viewable=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we always hide bills, should we index them at all?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(i.e., do we have all the info we need to omit them when we build the index?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(i forget, does our current pipeline update the index with every scrape, or with enough frequency for this suggestion to make sense?)

Copy link
Contributor Author

@reginafcompton reginafcompton Oct 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked myself this very question!

I decided to index "hidden" bills for one reason – the integrity script, which checks that the Councilmatic database and the Solr index have the same number of bills. If we do not index a "hidden" bill, then this script will complain. I think it's easier to modify the SearchQuerySet, rather than find a workaround with how the integrity script or import_data works.

(N.B. data_integrity runs after every execution of update_index, which runs after import_data which runs four times per hour.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha, great minds. :-) that makes sense to me. maybe it would be good a drop a comment or docstring in the search_indexes.py file?


return sqs


Expand Down
2 changes: 2 additions & 0 deletions solr_scripts/schema.xml
Original file line number Diff line number Diff line change
Expand Up @@ -201,6 +201,8 @@

<field name="identifier" type="text_en" indexed="true" stored="true" multiValued="false" />

<field name="viewable" type="boolean" indexed="true" stored="false" multiValued="false" />

</fields>

<!-- field to use to determine and enforce document uniqueness. -->
Expand Down
30 changes: 26 additions & 4 deletions tests/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,8 @@

from django.core.management import call_command

from councilmatic_core.models import EventDocument, Bill, Membership
from lametro.models import LAMetroPerson, LAMetroEvent, LAMetroOrganization

from councilmatic_core.models import EventDocument, Bill, EventAgendaItem, Membership
from lametro.models import LAMetroPerson, LAMetroEvent, LAMetroBill, LAMetroOrganization

def get_uid_chunk(uid=None):
'''
Expand Down Expand Up @@ -37,7 +36,7 @@ def build(self, **kwargs):

bill_info.update(kwargs)

bill = Bill.objects.create(**bill_info)
bill = LAMetroBill.objects.create(**bill_info)
bill.save()

return bill
Expand Down Expand Up @@ -68,6 +67,29 @@ def build(self, **kwargs):

return EventFactory()

@pytest.fixture
@pytest.mark.django_db
def event_agenda_item(db, event):
class EventAgendaItemFactory():
def build(self, **kwargs):
named_event = event.build()

event_agenda_item_info = {
'event_id': named_event.ocd_id,
'updated_at': '2017-05-27 11:10:46.574-05',
'order': 1,
}

event_agenda_item_info.update(kwargs)

event_agenda_item = EventAgendaItem.objects.create(**event_agenda_item_info)
event_agenda_item.save()

return event_agenda_item

return EventAgendaItemFactory()


@pytest.fixture
@pytest.mark.django_db
def event_document(db):
Expand Down
32 changes: 31 additions & 1 deletion tests/test_bills.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

from django.core.urlresolvers import reverse

from councilmatic_core.models import Bill, RelatedBill
from councilmatic_core.models import Bill, RelatedBill, Event
from lametro.utils import format_full_text, parse_subject

# This collection of tests checks the functionality of Bill-specific views, helper functions, and relations.
Expand Down Expand Up @@ -59,5 +59,35 @@ def test_format_full_text(bill, text, subject):

assert format_full_text(full_text) == subject

@pytest.mark.parametrize('bill_type,event_status,assertion', [
('Board Box', 'passed', True),
('Resolution', 'passed', True),
('Resolution', 'cancelled', True),
('Resolution', 'confirmed', False),
])
def test_viewable_bill(bill,
event_agenda_item,
bill_type,
event_status,
assertion):
bill_info = {
'bill_type': bill_type,
}
bill = bill.build(**bill_info)
bill.refresh_from_db()

event_agenda_item_info = {
'bill_id': bill.ocd_id,
}
item = event_agenda_item.build(**event_agenda_item_info)
item.refresh_from_db()

event = Event.objects.get(ocd_id=item.event_id)
event.status = event_status
event.save()
event.refresh_from_db()

assert bill.is_viewable == assertion