Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for function-based DEFAULT values, not only literals #36

Closed
tsionyx opened this issue May 29, 2017 · 11 comments
Closed

Support for function-based DEFAULT values, not only literals #36

tsionyx opened this issue May 29, 2017 · 11 comments

Comments

@tsionyx
Copy link
Contributor

tsionyx commented May 29, 2017

I'd like to create this simple table:

CREATE TABLE foo
(
    date Date DEFAULT now(), 
    a UInt8
) ENGINE = MergeTree(date, a, 8192)

so I use the following:

class Foo(models.Model):
    engine = engines.MergeTree('date', ['a'])
    date = fields.DateField(default="now()")
    a = fields.UInt8Field()

and get the error:

...
     return escape(value.isoformat(), quote)
TypeError: 'str' object is not callable

Even if I changed to this

class Foo(models.Model):
    engine = engines.MergeTree('date', ['a'])
    date = fields.DateField(default=type('', (), {'isoformat': lambda: "now()"}))
    a = fields.UInt8Field()

I still get the error (due to quoting):

infi.clickhouse_orm.database.DatabaseException: Code: 38, e.displayText() = DB::Exception: Cannot parse date now(): Cannot parse Date from String, e.what() = DB::Exception

How can I achieve the desired behaviour?

@tsionyx tsionyx closed this as completed May 29, 2017
@tsionyx tsionyx reopened this May 29, 2017
@ishirav
Copy link
Contributor

ishirav commented May 29, 2017

Off the top of my head, we'll probably need to add a Func class that marks the default value as a function instead of a literal value. So you would use it like this:
date = fields.DateField(default=Func("now()"))
This might even be useful when building queries, allowing you to compare a field's value to dynamic values - e.g.
Foo.objects_in(db).filter(date__lt=Func("now()"))

@VeryaskinMax
Copy link

VeryaskinMax commented Oct 29, 2019

Is this issue still in progress?
It is very needed to we make ability to write like this, when we create a table model:

event_time = fields.DateTimeField()
event_date = fields.DateField(default="toDate(event_time)")

With above i get error ValueError: time data 'toDate(event_time)' does not match format '%Y-%m-%d'. But i need to fill field event_date from event_time on fly.

@ishirav
Copy link
Contributor

ishirav commented Oct 29, 2019

Yes, I'm actively working on this for v2.

For the use case you described, try using "materialized" instead of "default". It should work, and would also prevent the event_date and event_time from being mismatched (since event_date will always be calculated from event_time, you won't be able to override its value by mistake).

@VeryaskinMax
Copy link

Yes, i thought about materialized field. It may help. But will be waiting for upgraded "default" option feature =)

@ishirav
Copy link
Contributor

ishirav commented Dec 13, 2019

While working on this feature I ran into a showstopper that probably makes it impossible to support functions as default values.

Consider the Foo model above, with fields date and a. We can try inserting two instances of this model to the database:

instance1 = Foo(date='2019-12-31', a=17)
instance2 = Foo(a=18) # use the default value for the date column
database.insert([instance1, instance2])

This generates an SQL statement more or less like this:

INSERT INTO foo (date, a) FORMAT TabSeparated

and then the field values are sent as tab-separated strings:

2019-12-31    17
<something>   18

The <something> is the problem - what value can we put there to tell ClickHouse to use the default expression? Sending null (\N) or an empty string does not work, so I don't think there's a way to do this. The default expression will be used only if the date column is not sent at all in the INSERT statement.

@VeryaskinMax
Copy link

I saw this feature in one php project, which written with Symfony and uses Annotations (if you know these things). Look at the code:

    /**
     * @var string
     * @CHType(type="Date DEFAULT toDate(today())", behavior="SKIP_PROPERTY")
     */
    protected $date = self::DATE_DEFAULT;

It describes one calculated field of a table. If we divide a params into a two parts and translate it into python, we could write like this:

date = fields.DateField(default='toDate(today())', skip_property=True)

And then we should check this param, and if property 'skip_property' equal 'True', then remove it from INSERT statement.

@ishirav
Copy link
Contributor

ishirav commented Dec 17, 2019

@VeryaskinMax - the problem is not identifying that there's a default value, its generating the correct INSERT statement. If you are inserting two records, one with a default value and one without, you need two separate INSERT statements (since the list of fields to insert is different between the two records).
Alternatively, you can use a single INSERT with a value that tells ClickHouse "this is not a real value here, just use the default instead".
It seems that it should possible to do when inserting with CSV or TSKV formats, but I couldn't find a way to make it work. See ClickHouse/ClickHouse#5653

@VeryaskinMax
Copy link

VeryaskinMax commented Dec 17, 2019

@ishirav, maybe I'm wrong, but if in the Model in the property (e.g. date above) you will have a param skip_property script should skip this property anyway, even if it's value is set. Or it should raise an error if a user try to set a value for a property which has a param skip_property=True.
In this case the list of fields to insert will be identical.

@ishirav
Copy link
Contributor

ishirav commented Dec 17, 2019

I'm not sure I understand what you're saying, but I think you're mixing between DEFAULT and MATERIALIZED.

  • When a field has a DEFAULT value, you can give it a value or you can leave it empty - in which case the db will calculate the value to use based on the DEFAULT expression.
  • MATERIALIZED means that you cannot give a value to the field: it always gets its value from the MATERIALIZED expression. You cannot insert anything to this field.

@aopetrov86
Copy link

aopetrov86 commented Jan 17, 2020

Have exactly same problem:
Target table:

CREATE TABLE content
(
    dt                   DateTime,
    date               Date default toDate(dt)
) 
ENGINE = MergeTree() 
PARTITION BY toYYYYMM(date) 
ORDER BY (date)

Model definition:

from infi.clickhouse_orm import fields, models, engines, migrations
class ClickhouseModel(models.Model):
    dt = fields.DateTimeField()
    date = fields.DateField(default='toDate(dt)')
    
    engine = engines.MergeTree(partition_key=('toYYYYMM(dt)',), order_by=('date',))

On migrate:

ValueError: time data 'toDate(dt)' does not match format '%Y-%m-%d'

@ishirav
Copy link
Contributor

ishirav commented May 29, 2020

Available in v2.0.0

@ishirav ishirav closed this as completed May 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants